TITLE: Three Dimensional Structure of a Sterile Alpha Motif Domain
FIELD OF THE INVENTION 

The invention relates to the three dimensional structure of a sterile alpha motif (SAM)
domain. The atomic coordinates that define the structure, and any compounds bound to the structure,
enable the identification of homologues, the determination of the three dimensional structures of
polypeptides with unknown structure, and the identification of modulators of a SAM domain.
BACKGROUND OF THE INVENTION 

The Eph family of receptor tyrosine kinases has been implicated in the control of axon
guidance [Henkemeyer, 1996; Orioli, 1996], cell migration [Krull, 1997], patterning of the nervous
system [Xu, 1996] and angiogenesis [Wang, 1998]. Eph receptors are activated by clustering into
dimers or tetramers [Stein, 1998]. However, the cell-surface ligands for Eph receptors (ephrins)
apparently lack an intrinsic ability to induce receptor oligomerization [Lackmann, 1997]. Factors
that influence receptor aggregation include the pre-clustering of ephrins [Davis, 1994], the homotypic
interaction between the extracellular domains of two receptor chains [Lackmann, 1998], and the
binding of PDZ domain containing proteins to the receptor's C-terminus [Hock, 1998].

All Eph receptors have a Sterile Alpha Motif (SAM) domain within their cytoplasmic
regions. The SAM domain was identified as a conserved sequence present in a small set of yeast
sexual differentiation proteins referred to as the Sterile Alpha Mating factors [Ponting, 1995; Schultz,
1997]. In ETS family transcription factors this sequence has also been termed the Pointed domain
[Klambt, 1993]. The domain is found in a variety of proteins, many of which contain catalytic
domains or recognized protein interaction domains. SAM domains are almost always located at a
protein's N- or C-terminus. A highly conserved SAM domain is located in the cytoplasmic region of
Eph receptors (approximately 50% identity over 14 family members), C-terminal to the catalytic
domain and followed by only 5 residues that form a potential PDZ domain binding site [Hock, 1998].
Amongst receptor tyrosine kinases, the presence of a cytoplasmic module other than the protein
kinase domain is unique to Eph receptors.

The SAM domain can function as a protein interaction module through an ability to homo-
and hetero-dimerize with other SAM domains [Jousset, 1997; Peterson, 1997; Tu, 1997; Kyba, 1998].
This dimerizing property elicits oncogenic activation of chimeric proteins arising from translocation
of the SAM domain of TEL to coding regions of the βPDGF receptor [Golub, 1994], Abl [Golub,
1996], and JAK2 protein kinases [Lacronique, 1997] or the AML1 transcription factor [Golub, 1995].
A functional role in mediating homo- and hetero-typic dimerization has been shown for SAM domains
in the transcription factor TEL [Jousset, 1997], members of the polycomb group of transcriptional
repressors (RAE28, Scm and ph) [Peterson, 1997], the protein kinase Byr2p [Tu, 1997], and the α
and β isoforms of the liprin scaffolding proteins [Serra-Pages, 1998].
SUMMARY OF THE INVENTION 

Broadly stated, the present invention relates to the three-dimensional structure of one or
more SAM domains. The three-dimensional structures may be complexed with one or more
compounds. The defined boundaries and properties of the structures, and of any compounds bound
to them, are pertinent to methods for determining the three-dimensional structures of polypeptides with
unknown structure, and to methods that identify modulators of SAM domain function. These
modulators are potentially useful as therapeutics for diseases, including (but not limited to) cell
proliferative diseases, such as cancer, angiogenesis, atherosclerosis, and arthritis, and diseases
associated with the nervous system.

Broadly stated, the present invention relates to a crystalline form of a polypeptide
corresponding to one or more SAM domains, preferably one or more SAM domains of an Eph
receptor, preferably of EphA. The crystalline form may comprise one or more heavy metal atoms, or
at least one compound. In a preferred embodiment, a unit cell of the crystalline form of the invention
has dimensions of about a = b = 77.14 ± .03 angstroms, c = 24.3 ± .04 angstroms.

The invention also relates to a method of forming a crystalline form of the invention
comprising:

(a) mixing a volume of a SAM domain polypeptide solution with a reservoir solution; and

(b) incubating the mixture obtained in step (a) over the reservoir solution in a closed
container under conditions suitable for crystallization.

The invention also features a method of determining three dimensional structures of 
polypeptides with unknown structure comprising the step of applying the structural atomic 
coordinates of a crystalline form of one or more SAM domains of the invention. 

Methods are also provided for identifying a potential modulator of a SAM domain function,
preferably a function of a SAM domain of an Eph receptor, by docking a computer representation of a
structure of a compound with a computer representation of a structure of one or more SAM domains
of the invention, preferably a SAM domain of an Eph receptor, that is defined by the atomic structural
coordinates described herein. In an embodiment the method comprises the following steps:

(a) docking a computer representation of a compound from a computer database with
a computer representation of a selected site on a SAM domain, preferably a SAM
domain of an Eph receptor, to obtain a complex;

(b) determining a conformation of the complex with a favourable geometric fit and
favourable complementary interactions; and

(c) identifying compounds that best fit the selected site as potential modulators of
SAM domain function.
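
Steps (a) to (c) can be illustrated with a deliberately simplified sketch. It is not the docking procedure of the invention and does not stand in for any real docking program: the scoring rule, the distance cut-offs (4.0 and 2.0 angstroms), the site coordinates and the compound names are all hypothetical, chosen only to show how pre-computed poses might be scored for geometric fit and then ranked.

```python
import math

def fit_score(ligand_atoms, site_atoms, contact=4.0, clash=2.0):
    """Toy geometric-fit score (hypothetical rule, not a real force field).

    Each ligand/site atom pair closer than `clash` angstroms is penalised,
    and each pair within `contact` angstroms counts as a favourable contact.
    """
    score = 0
    for la in ligand_atoms:
        for sa in site_atoms:
            d = math.dist(la, sa)
            if d < clash:
                score -= 5   # steric clash
            elif d <= contact:
                score += 1   # favourable contact
    return score

def rank_compounds(poses, site_atoms):
    """Step (c): order compound names from best to worst toy score."""
    return sorted(poses, key=lambda name: fit_score(poses[name], site_atoms),
                  reverse=True)

# Hypothetical selected site and one docked pose per hypothetical compound.
site = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
poses = {
    "cmpd_A": [(1.5, 3.0, 0.0)],   # within contact range of both site atoms
    "cmpd_B": [(0.5, 0.0, 0.0)],   # clashes with the first site atom
    "cmpd_C": [(20.0, 0.0, 0.0)],  # too far away to interact
}
ranking = rank_compounds(poses, site)
```

A practical implementation would also sample ligand conformations and orientations and score electrostatic and hydrogen-bonding complementarity, which this sketch omits.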

In another embodiment the method comprises the following steps: 

(a) modifying a computer representation of a compound complexed with a selected site 
on a SAM domain, preferably a SAM domain of an Eph receptor, by deleting or 
adding a chemical group or groups; 

(b) determining a conformation of the complex with a favourable geometric fit and 
favourable complementary interactions; and 

(c) identifying a compound that best fits the selected site as a potential modulator of a
SAM domain.

In still another embodiment the method comprises the following steps:

(a) selecting a computer representation of a compound complexed with a selected site
on a SAM domain, preferably a SAM domain of an Eph receptor; and

(b) searching for molecules in a database that are similar to the compound using a
searching computer program, or replacing portions of the compound with similar
chemical structures from a database using a compound building computer
program.

The invention also features a potential modulator of a function of a SAM domain, preferably
a SAM domain of an Eph receptor, identified by the methods of the invention, and a method of
treating a disease associated with a SAM domain, preferably a SAM domain of an Eph receptor, with
inappropriate activity in a cellular organism, comprising:

(a) administering a modulator identified using the methods of the invention in an
acceptable pharmaceutical preparation; and

(b) activating or inhibiting a SAM domain function to treat the disease.

The invention also provides peptides that mediate SAM domain function. 
BRIEF DESCRIPTION OF THE DRAWINGS

The invention will now be described in relation to the drawings in which: 
Figure 1A shows a sequence alignment of SAM domains from selected proteins (SEQ. ID.
NOS. 1 to 21);

Figure 1B shows a selection of multi-domain proteins containing a SAM domain (S);

Figure 2A is a ribbons depiction of the SAM homo-dimer viewed down the twofold
symmetry axis;

Figure 2B is a ribbons depiction of the SAM homo-dimer viewed perpendicular to the
symmetry axis;

Figure 2C is a ribbons stereo view highlighting the dimer interface region;

Figure 3A is a molecular surface and worm representation of the SAM homodimer;

Figure 3B is a molecular surface and worm representation of the SAM homodimer; and

Figure 4 is a gel filtration elution profile of wild type and single or double site mutants of the
EphA4 receptor SAM domain.

DETAILED DESCRIPTION OF THE INVENTION 
DEFINITIONS:

Unless otherwise indicated, all terms used herein have the same meaning as they would to
one skilled in the art of the present invention. Practitioners are particularly directed to Current
Protocols in Molecular Biology (Ausubel) for definitions and terms of the art.

Abbreviations for amino acid residues are the standard 3-letter and/or 1-letter codes used in
the art to refer to one of the 20 common L-amino acids. Likewise, abbreviations for nucleic acids are
the standard codes used in the art.

The term "crystalline form", in the context of the invention, is a crystal formed from an
aqueous solution comprising a purified polypeptide comprising one or more SAM domains,
preferably a SAM domain of an Eph receptor. A crystalline form of a SAM domain is characterized
as being capable of diffracting X-rays in a pattern defined by one of the crystal forms depicted in
Blundell et al., 1976, Protein Crystallography, Academic Press. A crystalline form may include a
crystal structure in association with one or more heavy-metal atoms, i.e. a derivative crystal, or a
crystal structure in association with one or more compounds, i.e. a co-crystal.
The term "association" refers to a condition of proximity between a chemical entity or
compound or portions or fragments thereof, and a SAM domain or portions or fragments thereof. The
association may be non-covalent, i.e. where the juxtaposition is energetically favored by, for example,
hydrogen-bonding, van der Waals, or electrostatic or hydrophobic interactions, or it may be covalent.

The term "heavy-metal atoms" refers to an atom that is a transition element, a lanthanide
metal, or an actinide metal. Lanthanide metals include elements with atomic numbers between 57 and
71, inclusive. Actinide metals include elements with atomic numbers between 89 and 103, inclusive.
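
The definition above can be expressed as a small lookup by atomic number. This is a sketch only: the lanthanide and actinide ranges are those stated in the text, whereas the transition-element ranges (the d-block, groups 3 to 12) are an assumption made for illustration, since the text does not enumerate them.

```python
# Assumed d-block ranges (groups 3-12); not enumerated in the text.
TRANSITION_RANGES = [(21, 30), (39, 48), (72, 80), (104, 112)]

def classify_heavy_atom(atomic_number):
    """Return the heavy-metal class for an atomic number, or None."""
    if 57 <= atomic_number <= 71:        # range stated in the text
        return "lanthanide"
    if 89 <= atomic_number <= 103:       # range stated in the text
        return "actinide"
    for low, high in TRANSITION_RANGES:  # assumed ranges
        if low <= atomic_number <= high:
            return "transition element"
    return None

gold = classify_heavy_atom(79)
mercury = classify_heavy_atom(80)
carbon = classify_heavy_atom(6)
```

Under these assumed ranges, gold and mercury, the two heavy atoms named later as examples for derivative crystals, both fall in the transition-element class.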

The term "Eph receptor" refers to a subfamily of closely related transmembrane receptor
tyrosine kinases related to Eph, a receptor named for its expression in an erythropoietin-producing
human hepatocellular carcinoma cell line. The receptors contain cell adhesion-like domains on their
extracellular surface. The Eph subfamily receptor tyrosine kinases are more specifically characterised
as encoding a structurally related cysteine rich extracellular domain containing a single
immunoglobulin (Ig)-like loop near the N-terminus and two fibronectin III (FN III) repeats adjacent
to the plasma membrane. The Eph receptors are divided into two groups based on the relatedness of
their extracellular domain sequences. The grouping also corresponds to the ability of the receptors to
bind preferentially to the ephrin-A or ephrin-B proteins. The group that includes receptors interacting
preferentially with ephrin-A proteins is called EphA and includes EphA1 (also known as Eph and
Esk), EphA2 (also known as Eck, Myk2, Sek2), EphA3 (also known as Cek4, Mek4, Hek, Tyro4,
Hek4), EphA4 (also known as Sek, Sek1, Cek8, Hek8, Tyro1), EphA5 (also known as Ehk1, Bsk,
Cek7, Hek7, and Rek7), EphA6 (Ehk2, and Hek12), EphA7 (also known as Mdk1, Hek11, Ehk3, Ebk,
Cek11), and EphA8 (also known as Eek, Hek3). The group that includes receptors interacting
preferentially with ephrin-B proteins is called EphB and includes EphB1 (also known as Elk, Cek6,
Net, Hek6), EphB2 (also known as Cek5, Nuk, Erk, Qek5, Tyro5, Sek3, Hek5, Drt), EphB3 (also
known as Cek10, Hek2, Mdk5, Tyro6, and Sek4), EphB4 (also known as Htk, Myk1, Tyro11, Mdk2),
EphB5 (also known as Cek9, Hek9), and EphB6 (also known as Mep).
"Ephrin" refers to a class of ligands which are anchored to the cell membrane through a
transmembrane domain, and bind to the extracellular domain of an Eph receptor, facilitating
dimerization and autophosphorylation of the receptor and autophosphorylation of the ligand. The
ephrins which are targeted in the methods of the invention are those that bind to and activate (i.e.
phosphorylate) an EphA or an EphB receptor, preferably an EphA receptor. The ephrin-A ligands
(GPI-anchored ligands) are ephrin-A1 (also known as B61, LERK1, EFL-1), ephrin-A2 (also known as
LERK6, Elf1, mCek7-L, cElf1), ephrin-A3 (also known as LERK3, Ehk1-L, and EFL-2), ephrin-A4
(also known as LERK4, EFL-4, mLERK4), ephrin-A5 (AL1, LERK7, EFL-5, mAL1, [rLERK7],
RAGS), and the ephrin-B ligands (transmembrane ligands) are ephrin-B1 (also known as LERK2,
ELK-L, EFL-3, Cek5-L, Stra1, [LERK2]), ephrin-B2 (also known as LERK5, HTK-L, NLERK1,
Elf2, Htk-L), and ephrin-B3 (also known as LERK8, ELK-L3, NLERK2, EFL-6, ElO, [rELK-L3]).

The term "SAM domain" refers to a region known as the Sterile Alpha Motif (SAM) domain
within the cytoplasmic regions of all Eph receptors (Figure 1B), and in other proteins such as TEL
[Jousset, 1997], members of the polycomb group of transcriptional repressors (RAE28, Scm and ph)
[Peterson, 1997], the protein kinase Byr2p [Tu, 1997], the α and β isoforms of the liprin scaffolding
proteins [Serra-Pages, 1998], and tankyrase (Smith, S. et al., Science 282: 1484-1487, 1998, Accession
AF082556). The SAM domain was identified as a conserved sequence present in a small set of yeast
sexual differentiation proteins referred to as the Sterile Alpha Mating factors [Ponting, 1995; Schultz,
1997]. In ETS family transcription factors this sequence has also been termed the Pointed domain
[Klambt, 1993]. Extensive database searching and sequence alignment analysis (Figure 1A) reveals
that this domain is found in a variety of proteins, many of which contain catalytic domains or
recognized protein interaction domains (Figure 1B). SAM domains are almost always located at a
protein's N- or C-terminus. A highly conserved SAM domain is located in the cytoplasmic region of
Eph receptors (approximately 50% identity over 14 family members), C-terminal to the catalytic
domain and followed by only 5 residues that form a potential PDZ domain binding site [Hock, 1998].
The term also includes amino acid sequences having substantial sequence identity to a SAM domain,
a mutant, or a subunit of a SAM domain. Preferably the SAM domain is an "Eph SAM domain", i.e.
a SAM domain of an Eph receptor.
"SAM domain structure" or "SAM domain three dimensional structure" refers to the three
dimensional structure of a purified polypeptide comprising one or more SAM domains, preferably a
crystalline form.

As applied to polypeptides, the term "substantial sequence identity" means that two peptide
sequences, when optimally aligned, such as by the programs GAP or BESTFIT using default gap
weights, share at least 80 percent sequence identity, preferably at least 90 percent sequence identity,
more preferably at least 95 percent sequence identity or more. Preferably, residue positions which are
not identical differ by conservative amino acid substitutions. For example, the substitution of amino
acids having similar chemical properties such as charge or polarity is not likely to affect the properties
of a protein. Examples include glutamine for asparagine or glutamic acid for aspartic acid.
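
The identity thresholds above can be illustrated with a short calculation on two sequences that have already been aligned. This is a sketch under stated assumptions: identity is counted over aligned columns in which neither sequence has a gap (programs such as GAP and BESTFIT may define the denominator differently), and the two ten-residue sequences are invented for illustration and are not SAM domain sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the non-gap columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue            # skip columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Hypothetical aligned sequences differing at one position (F versus Y).
identity = percent_identity("ACDEFGHIKL", "ACDEYGHIKL")
is_substantial = identity >= 80.0   # lowest threshold given in the text
```

The single difference in this example, phenylalanine against tyrosine, is itself a conservative substitution of the kind the definition prefers at non-identical positions.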

The term "mutant" refers to a polypeptide that is obtained by replacing at least one amino
acid residue in a native SAM domain with a different amino acid residue. Mutation can also be
accomplished by adding and/or deleting amino acid residues within the native SAM domain. A
mutant may or may not be functional.

The term "function" refers to the ability of a modulator to enhance or inhibit the association
between a SAM domain and a compound.

The term "atomic structural coordinates" as used herein refers to a data set that defines the
three dimensional structure of a molecule or molecules (e.g. unit cell axial lengths, space group).
Structural coordinates can be slightly modified and still render nearly identical three dimensional
structures. A measure of a unique set of structural coordinates is the root-mean-square deviation of
the resulting structure. Structural coordinates that render three dimensional structures that deviate
from one another by a root-mean-square deviation of less than 1.5 Å may be viewed by a person of
ordinary skill in the art as identical. Structural coordinates for a SAM domain are in Table 2.
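
The root-mean-square deviation criterion can be sketched as follows. The example assumes that the two coordinate sets list the same atoms in the same order and have already been superimposed; it does not perform the optimal superposition that a structure comparison program would apply first. The coordinates are invented for illustration.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired coordinate sets (in Å)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

def effectively_identical(coords_a, coords_b, cutoff=1.5):
    """Apply the 1.5 Å criterion given in the definition above."""
    return rmsd(coords_a, coords_b) < cutoff

# Hypothetical three-atom structure and a copy displaced by 0.5 Å per atom.
structure_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
structure_2 = [(0.3, 0.0, 0.4), (1.3, 0.0, 0.4), (0.3, 1.0, 0.4)]
deviation = rmsd(structure_1, structure_2)
same = effectively_identical(structure_1, structure_2)
```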

The term "unit cell" refers to the smallest and simplest volume element (i.e. parallelepiped-shaped
block) of a crystal that is completely representative of the unit of pattern of the crystal. The
unit cell axial lengths are represented by a, b, and c, where a = x axis, b = y axis, and c = z axis. Those
of skill in the art understand that a set of atomic coordinates determined by X-ray crystallography is
not without standard error.

The term "space group" refers to the symmetry of a unit cell. In a space group designation
the capital letter indicates the lattice type and the other symbols represent symmetry operations that
can be carried out on the unit cell without changing its appearance.

The term "purified", in reference to a polypeptide, does not require absolute purity such as a
homogenous preparation; rather, it represents an indication that the sequence is relatively purer than in
the natural environment. Generally, a purified polypeptide is substantially free of other proteins,
lipids, carbohydrates, or other materials with which it is naturally associated, preferably at a
functionally significant level, for example at least 85% pure, more preferably at least 95% pure, most
preferably at least 99% pure. A skilled artisan can purify a polypeptide comprising a SAM domain
using standard techniques for protein purification. A substantially pure polypeptide comprising a SAM
domain will yield a single major band on a non-reducing polyacrylamide gel. The purity of the SAM
domain polypeptide can also be determined by amino-terminal amino acid sequence analysis.
Three Dimensional Structure of SAM Domain 
The present invention provides a purified SAM domain three dimensional structure. In an
embodiment the structure is a crystalline form. A SAM domain structure may comprise one or more
SAM domains in a unit cell, preferably two, three or four SAM domains. In a preferred embodiment,
a SAM domain is arranged in a crystalline manner in space group P6₄ so as to form a unit cell of
dimensions a = b = 77.14 angstroms, c = 24.37 angstroms, which effectively diffracts X-rays for
determination of the atomic coordinates of the SAM domain to a resolution of about 2.9 angstroms.
The 3-dimensional structure of a preferred SAM domain of the invention is shown in Figures 2 and 3.
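
As a worked illustration of the unit cell dimensions given above, the cell volume can be computed from the general formula for a parallelepiped cell. The cell angles are not stated in the text; the values used here (alpha = beta = 90 degrees, gamma = 120 degrees) are an assumption based on a = b and the six-fold space group symbol, which together indicate a hexagonal lattice.

```python
import math

def unit_cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Volume of a unit cell from axial lengths (Å) and angles (degrees)."""
    ca = math.cos(math.radians(alpha))
    cb = math.cos(math.radians(beta))
    cg = math.cos(math.radians(gamma))
    # General (triclinic) expression; reduces to a*b*c*sin(gamma)
    # when alpha = beta = 90 degrees.
    return a * b * c * math.sqrt(1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg)

# Axial lengths from the text; angles are the assumed hexagonal values.
volume = unit_cell_volume(77.14, 77.14, 24.37, alpha=90.0, beta=90.0, gamma=120.0)
```

With these assumed angles the volume comes to roughly 1.26 x 10^5 cubic angstroms.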
A crystalline form includes native crystals, derivative crystals, and co-crystals. The native
crystals generally comprise substantially pure polypeptides comprising one or more SAM domains in
crystalline form. It is understood that the crystalline form is not limited to naturally occurring or
native SAM domains but includes mutants of native SAM domains obtained by replacing at least one
amino acid residue in a native SAM domain with a different amino acid residue or by adding or
deleting amino acid residues within the native polypeptide, and having substantially the same three
dimensional structure as the native SAM domain from which the mutant is derived, i.e. having a set of
atomic structural coordinates that have a root mean square deviation of less than or equal to about 2 Å
when superimposed with the atomic structure coordinates of the native SAM domain from which the
mutant is derived when at least 50% to 100% of the atoms of the native SAM domain are included in
the superimposition. It should be noted that the mutants contemplated herein need not exhibit SAM
domain activity.

The derivative crystals of the invention generally comprise a crystalline SAM domain in 
covalent association with one or more heavy metal atoms. The SAM domain may correspond to a 
native or mutated SAM domain. Heavy metal atoms useful for providing derivative crystals include,
by way of example and not limitation, gold and mercury.

The invention features a crystalline form of a SAM domain in association with one or more
compounds. The association may be covalent or non-covalent. These types of crystalline forms are
referred to herein as co-crystals. The compound may be any organic molecule, and it may modulate
the function of a SAM domain by, for example, inhibiting or enhancing its function, or it may be an
analogue of a SAM domain. It is preferred that the geometry of the compound and the interactions
formed between the compound and the SAM domain provide high affinity binding between the two
molecules. High affinity binding is preferably governed by a dissociation equilibrium constant on the

order of 10"* or less. 
Method for Preparing Crystal Forms of SAM Domain

The invention also features a method for creating the crystalline SAM domain structures
described herein. The method may utilize a polypeptide comprising a SAM domain described herein
to form a crystal. A polypeptide used in the method may be chemically synthesized in whole or in
part using techniques that are well-known in the art. Alternatively, methods are well known to the
skilled artisan to construct expression vectors containing the native or mutated SAM domain coding
sequence and appropriate transcriptional/translational control signals. These methods include in vitro
recombinant DNA techniques, synthetic techniques, and in vivo recombination/genetic
recombination. See for example the techniques described in Sambrook et al. (Molecular Cloning: A
Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press (1989)), and other laboratory
textbooks.

Crystals are grown from an aqueous solution containing the purified and concentrated SAM
domain polypeptide by a variety of conventional processes. These processes include batch, liquid
bridge, dialysis, vapor diffusion, and hanging drop methods. (See for example, McPherson, 1982,
John Wiley, New York; McPherson, 1990, Eur. J. Biochem. 189: 1-23; Weber, 1991, Adv. Protein
Chem. 41:1-36). Generally, the native crystals of the invention are grown by adding precipitants to
the concentrated solution of the SAM domain polypeptide. The precipitants are added at a
concentration just below that necessary to precipitate the protein. Water is removed by controlled
evaporation to produce precipitating conditions, which are maintained until crystal growth ceases.
In an embodiment of the invention, the method generally comprises the steps of:

(a) mixing a volume of polypeptide solution with a reservoir solution; and 

(b) incubating the mixture obtained in step (a) over the reservoir solution in a closed 
container, under conditions suitable for crystallization. 

For crystals of the invention, it has been found that hanging drops containing about 1 µl of
SAM domain polypeptide (50-150 mg/ml, preferably 100 mg/ml, in 5-20 mM, preferably 7 mM, Hepes
pH 5.5 to 9, preferably 7.5) and equal volumes of reservoir buffer (50-150 mM, preferably 100 mM,
cacodylate pH 5.5 to 7.5, preferably 6.5; 5-10%, preferably 7%, (w/v) PEG 8000; and 10-30%,
preferably 20%, (v/v) ethylene glycol) suspended overnight at room temperature provide crystals
suitable for high resolution X-ray structure determination. It will be appreciated that the
above-described crystallization conditions can be varied and such variations can be used alone or in
combination. For example, other buffer solutions such as Tris-HCl buffer may be used.
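
Because the drop is made from equal volumes of polypeptide solution and reservoir buffer, every component starts at half its stock concentration in the drop. The arithmetic can be sketched as below. The function and the component labels are illustrative only, and the input values are the preferred values of the preceding paragraph as read from the text.

```python
def drop_concentrations(protein_vol, reservoir_vol, protein_stock, reservoir_stock):
    """Initial concentrations in a drop mixed from two stock solutions.

    Each component is diluted by (volume of its stock) / (total drop volume).
    Units are carried through unchanged (mg/ml, mM, or percent).
    """
    total = protein_vol + reservoir_vol
    mixed = {}
    for name, conc in protein_stock.items():
        mixed[name] = conc * protein_vol / total
    for name, conc in reservoir_stock.items():
        mixed[name] = mixed.get(name, 0.0) + conc * reservoir_vol / total
    return mixed

# Preferred values as read from the text; volumes in microlitres.
protein_stock = {"SAM domain (mg/ml)": 100.0, "Hepes (mM)": 7.0}
reservoir_stock = {
    "cacodylate (mM)": 100.0,
    "PEG 8000 (% w/v)": 7.0,
    "ethylene glycol (% v/v)": 20.0,
}
drop = drop_concentrations(1.0, 1.0, protein_stock, reservoir_stock)
```

During incubation over the reservoir, vapour diffusion gradually raises the precipitant concentrations in the drop from these initial values toward those of the reservoir.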

Derivative crystals of the invention can be obtained by soaking native crystals in a solution
containing salts of heavy metal atoms. Co-crystals of the invention can be obtained by soaking a
native crystal in a solution containing a compound that binds the SAM domain, or they can be
obtained by co-crystallizing the SAM domain polypeptide in the presence of one or more compounds
that bind to the SAM domain.

Once the crystal is grown it can be placed in a glass capillary tube and mounted onto a
holding device connected to an X-ray generator and an X-ray detection device. Collection of X-ray
diffraction patterns is well documented by those skilled in the art (see for example, Ducruix and
Giegé, 1992, IRL Press, Oxford, England). A beam of X-rays enters the crystal and diffracts from the
crystal. An X-ray detection device can be utilized to record the diffraction patterns emanating from
the crystal. Suitable devices include the Mar 345 imaging plate detector system with an RU200
rotating anode generator.

Methods for obtaining the three dimensional structure of the crystalline form of a molecule
or complex are described herein and known to those skilled in the art (see Ducruix and Giegé, supra).
Generally, the unit cell dimensions and orientation in the crystal can be determined from the spacing
between the diffraction emissions as well as the patterns made from the emissions. The symmetry of
the unit cell in the crystal is also determined. Each diffraction pattern emission is characterized as a
vector and the data collected at this stage determine the amplitude of each vector. The phases of the
vectors may be determined by the isomorphous replacement method, where heavy atoms soaked into
the crystal are used as reference points in the X-ray analysis (see for example, Otwinowski, 1991,
Daresbury, United Kingdom, 80-86). The phases of the vectors may also be determined by molecular
replacement (see for example, Naraza, 1994, Proteins 11:281-296). The amplitudes and phases of
vectors from the crystalline form of an Eph SAM domain, preferably an EphA4 SAM domain,
determined in accordance with these methods can be used to analyze other crystalline SAM domains.

The unit cell dimensions and symmetry, and vector amplitude and phase information can be
used in a Fourier transform function to calculate the electron density in the unit cell, i.e. to generate an
experimental electron density map. This may be accomplished using the PHASES package (Furey,
1990). Amino acid sequence structures are fit to the experimental electron density map (i.e. model
building) using computer programs (e.g. Jones, T.A. et al., Acta Crystallogr. A47, 110-119, 1991) to
calculate a theoretical electron density map. The theoretical and experimental electron density maps
can be compared and the agreement between the maps can be described by a parameter referred to as
the R-factor. A high degree of overlap in the maps is represented by a low value R-factor. The R-factor
can be minimized by using computer programs that refine the theoretical electron density map. For
example, the XPLOR program, developed by Brunger (1992, Nature 355:472-475) can be used for
model refinement.
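
The R-factor is conventionally computed from structure factor amplitudes rather than from the maps directly: it is the sum of the absolute differences between observed and calculated amplitudes divided by the sum of the observed amplitudes. The sketch below uses that conventional definition, which the text does not spell out, assumes the two sets are already on a common scale, and uses invented amplitudes rather than measured data.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor.

    R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|), taken over paired
    reflections that are assumed to be on the same scale.
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and equal in length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

# Hypothetical observed and model-derived amplitudes for four reflections.
observed = [100.0, 50.0, 25.0, 25.0]
calculated = [90.0, 55.0, 30.0, 20.0]
r_value = r_factor(observed, calculated)
```

A perfect model gives an R-factor of zero, which is why refinement programs adjust the model to drive the value down.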

A three dimensional structure of the molecule or complex may be described by atoms that fit
the theoretical electron density characterized by a minimum R value. Files can be created for the
structure that define each atom by coordinates in three dimensions.
Identification of Homologues

The knowledge of the three dimensional structure of a SAM domain, in particular the
EphA4 SAM domain, enables one skilled in the art to identify homologues. This is achieved by
searches of three-dimensional databases. Since structural folds are conserved to a greater extent than
sequence, one may identify homologues with very little sequence similarity. Programs that provide
this type of database searching are known in the art and include Dali. The structural coordinates of a
protein structure are submitted and the program performs a multiple structural alignment with
proteins in the Protein Data Bank.

Methods for Determining Three Dimensional Structures 
The structure coordinates of a SAM domain structure described herein can be used as a
model for determining the three dimensional structures of additional native or mutated SAM domains
with unknown structure, as well as the structures of co-crystals of SAM domains with compounds
such as modulators (e.g. agonists or antagonists). The structure coordinates and models of a SAM
domain three dimensional structure can also be used to determine solution-based structures of native
or mutant SAM domains.

Three dimensional structure may be determined by applying the structural coordinates of a
SAM domain structure to other data such as an amino acid sequence, X-ray crystallographic
diffraction data, or nuclear magnetic resonance (NMR) data. Homology modeling, molecular
replacement, and nuclear magnetic resonance methods using these other data sets are described
below.

Homology modeling (also known as comparative modeling or knowledge-based modeling)
methods develop a three dimensional model from a polypeptide sequence based on the structures of
known proteins. In the present invention the method utilizes a computer representation of the three
dimensional structure of a SAM domain, preferably the EphA SAM domain, more preferably the
EphA4 SAM domain, or a complex of same, a computer representation of the amino acid sequence of
a polypeptide with an unknown structure, and standard computer representations of the structures of
amino acids. The method in particular comprises the steps of: (a) identifying structurally conserved
and variable regions in the known structure; (b) aligning the amino acid sequences of the known
structure and unknown structure; (c) generating coordinates of main chain atoms and side chain
atoms in structurally conserved and variable regions of the unknown structure based on the
coordinates of the known structure, thereby obtaining a homology model; and (d) refining the
homology model to obtain a three dimensional structure for the unknown structure. This method is
well known to those skilled in the art (Greer, 1985, Science 228, 1055; Blundell et al., 1988, Eur. J.
Biochem. 172, 513; Knighton et al., 1992, Science 258:130-135;
http://biochem.vt.edu/courses/modeling/homology.htm). Computer programs that can be used in
homology modeling are Quanta and the Homology module in the Insight II modeling package
distributed by Molecular Simulations Inc., or MODELLER (Rockefeller University,
www.iucr.ac.uk/sincris-top/logiciel/prg-modeller.html).
In step (a) of the homology modeling method, the known SAM domain structure (e.g.
structure of the EphA4 SAM domain) is examined to identify the structurally conserved regions
(SCRs) from which an average structure, or framework, can be constructed for these regions of the
protein. Variable regions (VRs), in which known structures may differ in conformation, also must be
identified. SCRs generally correspond to the elements of secondary structure, such as alpha-helices
(the four α-helices in the EphA4 SAM domain) and beta-sheets, and to ligand- and substrate-binding
sites. The VRs usually lie on the surface of the proteins and form the loops where the main chain
turns.

Many methods are available for sequence alignment of known structures and unknown
structure. Sequence alignments generally are based on the dynamic programming algorithm of
Needleman and Wunsch [J. Mol. Biol. 48: 442-453, 1970]. Current methods include FASTA,
Smith-Waterman, and BLASTP, with the BLASTP method differing from the other two in not allowing
gaps. Scoring of alignments typically involves construction of a 20x20 matrix in which identical
amino acids and those of similar character (i.e., conservative substitutions) may be scored higher than
those of different character. Substitution schemes which may be used to score alignments include the
scoring matrices PAM (Dayhoff et al., Meth. Enzymol. 91: 524-545, 1983) and BLOSUM (Henikoff
and Henikoff, Proc. Natl. Acad. Sci. USA 89: 10915-10919, 1992), and the matrices based on
alignments derived from three-dimensional structures, including that of Johnson and Overington (JO
matrices) (J. Mol. Biol. 233: 716-738, 1993).
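
The dynamic programming approach referred to above can be illustrated with a minimal global-alignment sketch. It is not the implementation used in Quanta or any named package. The function name and the flat match/mismatch/gap scores are assumptions chosen for brevity, and in practice a 20x20 substitution matrix such as PAM or BLOSUM would replace the two-valued score.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align sequences a and b.

    Returns (score, aligned_a, aligned_b). The scores are illustrative
    placeholders, not values taken from any published matrix.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Calling `needleman_wunsch("LEAVV", "LEVV")` aligns the shorter sequence against the longer one by opening a single gap opposite the unmatched residue.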

Alignment based solely on sequence may be used, though other structural features also may
be taken into account. In Quanta, multiple sequence alignment algorithms are available that may be
used when aligning a sequence of the unknown with the known structures. Four scoring systems (i.e.
sequence homology, secondary structure homology, residue accessibility homology, CA-CA distance
homology) are available, each of which may be evaluated during an alignment so that relative
statistical weights may be assigned.

When generating coordinates for the unknown structure, main chain atoms and side chain
atoms, both in SCRs and VRs, need to be modeled. A variety of approaches known to those skilled in
the art may be used to assign coordinates to the unknown. In particular, the coordinates of the main
chain atoms of SCRs will be transferred to the unknown structure. VRs correspond most often to the
loops on the surface of the polypeptide, and if a loop in the known structure is a good model for the
unknown, then the main chain coordinates of the known structure may be copied. Side chain
coordinates of SCRs and VRs are copied if the residue type in the unknown is identical to or very
similar to that in the known structure. For other side chain coordinates, a side chain rotamer library
may be used to define the side chain coordinates. When a good model for a loop cannot be found,
fragment databases may be searched for loops in other proteins that may provide a suitable model for
the unknown. If desired, the loop may then be subjected to conformational searching to identify low
energy conformers.

Once a homology model has been generated it should be analyzed to determine its
correctness. A computer program available to assist in this analysis is the Protein Health module in
Quanta, which provides a variety of tests. Other programs that provide structure analysis along with
output include PROCHECK and 3D-Profiler [Luthy R. et al., Nature 356: 83-85, 1992; and Bowie,
J.U. et al., Science 253: 164-170, 1991]. Once any irregularities have been resolved, the entire
structure may be further refined. Refinement may consist of energy minimization with restraints,
especially for the SCRs. Restraints may be gradually removed for subsequent minimizations.
Molecular dynamics may also be applied in conjunction with energy minimization.

Molecular replacement involves applying X-ray diffraction data of a known structure to the
incomplete X-ray crystallographic data set of a polypeptide of unknown structure. The method can be
used to define the phases describing the X-ray diffraction data of a polypeptide of unknown structure
when only the amplitudes are known. Commonly used computer software packages for molecular
replacement are X-PLOR (Brunger 1992, Nature 355: 472-475), AMoRE (Navaza, 1994, Acta
Crystallogr. A50: 157-163), the CCP4 package (Collaborative Computational Project, Number 4,
"The CCP4 Suite: Programs for Protein Crystallography", Acta Cryst., Vol. D50, pp. 760-763, 1994),
and the MERLOT package (P.M.D. Fitzgerald, J. Appl. Cryst., Vol. 21, pp. 273-278, 1988). It is
preferable that the resulting structure not exhibit a root-mean-square deviation of more than 3 Å.

The objective of molecular replacement is to align positions of atoms in the unit cell by
matching electron diffraction data from two crystals. Molecular replacement computer programs
generally involve the following steps: (1) determining the number of molecules in the unit cell and
defining the angles between them; (2) rotating the diffraction data to define the orientation of the
molecules in the unit cell; (3) translating the electron density in three dimensions to correctly position
the molecules in the unit cell; (4) determining the amplitudes and phases of the X-ray diffraction data
and calculating an R-factor from the reference data set and from the new data, wherein an
R-factor between 30-50% indicates that the orientations of the atoms in the unit cell have been
reasonably determined by the method; and (5) optionally decreasing the R-factor to about 20% by
refining the new electron density map using iterative refinement techniques known to those skilled in
the art.
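
Two quantities named in the molecular replacement discussion, the root-mean-square deviation and the R-factor, can be written down compactly. The sketch below uses the conventional textbook definitions and is not code from X-PLOR, AMoRE, CCP4 or MERLOT. The function names are illustrative assumptions, the RMSD routine assumes the two coordinate sets are already superimposed and listed in the same atom order, and the R-factor routine assumes the calculated amplitudes are already on the same scale as the observed ones.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    Assumes the structures are already superimposed and atom-matched.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def r_factor(f_obs, f_calc):
    """Conventional crystallographic R = sum(||Fobs| - |Fcalc||) / sum(|Fobs|).

    Assumes f_calc has already been scaled to f_obs.
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator
```

An R-factor returned as a fraction (for example 0.35) corresponds to the percentage figures quoted above (35%).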

In an embodiment of the invention, a method is provided for determining three dimensional
structures of polypeptides with unknown structure by applying the structural coordinates of a SAM
domain structure to an incomplete X-ray crystallographic data set for a polypeptide of unknown
structure, and determining a low energy conformation of the resulting structure.

The structural coordinates of a SAM domain structure may be applied to nuclear magnetic
resonance (NMR) data to determine the three dimensional structures of polypeptides. (See for
example, Wuthrich, 1986, John Wiley and Sons, New York: 176-199; Pflugrath et al., 1986, J.
Molecular Biology 189: 383-386; Kline et al., 1986, J. Molecular Biology 189: 377-382). While the
secondary structure of a polypeptide may often be determined by NMR data, the spatial connections
between individual pieces of secondary structure are not as readily determined. The structural
coordinates of a polypeptide defined by X-ray crystallography can guide the NMR spectroscopist to
an understanding of the spatial interactions between secondary structural elements in a polypeptide of
related structure. Information on spatial interactions between secondary structural elements can
greatly simplify Nuclear Overhauser Effect (NOE) data from two-dimensional NMR experiments. In
addition, applying the structural coordinates after the determination of secondary structure by NMR
techniques simplifies the assignment of NOEs relating to particular amino acids in the polypeptide
sequence and does not greatly bias the NMR analysis of polypeptide structure.

In an embodiment, the invention relates to a method of determining three dimensional
structures of polypeptides with unknown structures by applying the structural coordinates of a SAM
domain structure to nuclear magnetic resonance (NMR) data of the unknown structure. This method
comprises the steps of: (a) determining the secondary structure of an unknown structure using NMR
data; and (b) simplifying the assignment of through-space interactions of amino acids. The term
"through-space interactions" defines the orientation of the secondary structural elements in the three
dimensional structure and the distances between amino acids from different portions of the amino
acid sequence. The term "assignment" defines a method of analyzing NMR data and identifying
which amino acids give rise to signals in the NMR spectrum.
Identification of Potential Modulators of SAM Domains 

Modulators of a SAM domain may be designed and identified that may modify the
inappropriate activity of a SAM domain involved in a clinical disorder. The rational design and
identification of modulators of SAM domains can be accomplished by utilizing the atomic structural
coordinates that define a SAM domain's three dimensional structure.

Modulators may include substances that bind to or mimic the residues of a SAM domain that
are required for dimerization of SAM domains. For example, a substance that binds to or mimics the
interface residues of an EphA SAM domain (e.g. Val 913, Val 914, Met 972, Met 976, Met 979, Val
944, and Leu 940), or the proximal residues of an EphA SAM domain (e.g. Ile 959 to Lys) may
modify inappropriate activity of a SAM domain involved in a clinical disorder.

Structure-based modulator design identification methods are powerful techniques that can
involve searches of computer databases containing a variety of potential modulators and chemical
functional groups. (See Kuntz et al., 1994, Acc. Chem. Res. 27: 117; Guida, 1994, Current Opinion in
Struc. Biol. 4: 777; and Colman, 1994, Current Opinion in Struc. Biol. 4: 868, for reviews of
structure-based drug design and identification; and Kuntz et al., 1982, J. Mol. Biol. 162: 269; Kuntz et
al., 1994, Acc. Chem. Res. 27: 117; Meng et al., 1992, J. Comput. Chem. 13: 505; Bohm, 1994, J.
Comp. Aided Molec. Design 8: 623 for methods of structure-based modulator design).

The SAM domain three dimensional structure described herein, and the three dimensional 
structures of other polypeptides determined by the homology modeling, molecular replacement, and 
NMR techniques described herein can also be applied to modulator design and identification 
methods. 

Modulators of SAM domains may be identified by docking the computer representation of
compounds from a database of molecules. Databases which may be used include ACD (Molecular
Designs Limited), NCI (National Cancer Institute), CCDC (Cambridge Crystallographic Data
Center), CAST (Chemical Abstract Service), Derwent (Derwent Information Limited), Maybridge
(Maybridge Chemical Company Ltd), Aldrich (Aldrich Chemical Company), DOCK (University of
California in San Francisco), and the Directory of Natural Products (Chapman & Hall). Computer
programs such as CONCORD (Tripos Associates) or DB-Converter (Molecular Simulations Limited)
can be used to convert a data set represented in two dimensions to one represented in three
dimensions.

Generally, the computer programs comprise the following steps:

(a) docking the structure of a compound into an active-site of a polypeptide (e.g. EphA4
SAM domain) using the computer program, or by interactively moving the compound
into the active-site;

(b) characterizing the geometry and the complementary interactions formed between the
atoms of the active-site and the compound; and optionally

(c) searching libraries for molecular fragments which can fit into the empty space between
the compound and active site and can be linked to the compound; and

(d) linking the fragments found in (c) to the compound and evaluating the new modified
compound.

"Docking" refers to a process of placing a compound in close proximity with an active site
of a polypeptide (e.g. an Eph SAM domain), or a process of finding low energy conformations of a
compound/polypeptide complex (e.g. compound/Eph SAM domain).

Examples of other computer programs that may be used for structure-based modulator
design are CAVEAT (Bartlett et al., 1989, in "Chemical and Biological Problems in Molecular
Recognition", Roberts, S.M.; Ley, S.V.; Campbell, N.M. eds; Royal Society of Chemistry:
Cambridge, pp 182-196); FLOG (Miller et al., 1994, J. Comp. Aided Molec. Design 8: 153); PRO
Modulator (Clark et al., 1995, J. Comp. Aided Molec. Design 9: 13); MCSS (Miranker and Karplus,
1991, Proteins: Structure, Function, and Genetics 8: 195); and GRID (Goodford, 1985, J. Med. Chem.
28: 849).

In an embodiment of the invention, a method is provided for identifying potential
modulators of SAM domain function that utilizes the structural coordinates of a SAM
domain three dimensional structure. The method comprises the steps of (a) removing a computer
representation of a SAM domain structure, preferably an Eph SAM domain structure, more
preferably an EphA4 SAM domain structure, and docking a computer representation of a compound
from a computer data base with a computer representation of the active site of the SAM domain; (b)
determining a conformation of the complex with a favourable geometric fit or favourable
complementary interactions; and (c) identifying compounds that best fit the SAM domain active-site
as potential modulators of SAM domain function. The initial SAM domain structure may or may not
have compounds bound to it. A favourable geometric fit occurs when the surface area of a
compound in a compound-SAM domain complex is in close proximity with the surface area of the
active-site of the SAM domain without forming unfavourable interactions. A favourable
complementary interaction occurs where a compound in a compound-SAM domain complex interacts
by hydrophobic, aromatic, ionic, or hydrogen donating and accepting forces, with the active-site of a
SAM domain without forming unfavourable interactions. Unfavourable interactions may be steric
hindrance between atoms in the compound and atoms in the SAM active-site.
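
The notions of close proximity and steric hindrance described above can be made concrete with a simple distance test. This is only an illustrative sketch and not the scoring function of DOCK, LUDI or any other named program. The function names and both distance thresholds (2.0 Å for a clash, 4.5 Å for a contact) are assumptions chosen for the example, and atoms are reduced to bare (x, y, z) points without element types or radii. `math.dist` requires Python 3.8 or later.

```python
import math


def has_steric_clash(compound_atoms, site_atoms, min_distance=2.0):
    """Return True if any compound atom lies closer than min_distance
    (in angstroms, an assumed threshold) to any active-site atom."""
    for a in compound_atoms:
        for b in site_atoms:
            if math.dist(a, b) < min_distance:
                return True
    return False


def contact_count(compound_atoms, site_atoms, cutoff=4.5):
    """Count compound atoms that have at least one active-site atom within
    cutoff angstroms (an assumed threshold), as a crude proximity measure."""
    return sum(
        1
        for a in compound_atoms
        if any(math.dist(a, b) <= cutoff for b in site_atoms)
    )
```

Under these assumptions, a docked pose with many contacts and no clash would be a candidate for a favourable geometric fit. A realistic scoring scheme would also weigh the hydrophobic, aromatic, ionic and hydrogen-bonding terms listed above.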

In another embodiment, potential modulators are identified utilizing a three dimensional
structure of a SAM domain with or without compounds bound to it. The method comprises the steps
of (a) modifying a computer representation of a SAM domain (e.g. an Eph SAM domain) having one
or more compounds bound to it, where the computer representations of the compound or compounds
and SAM domain are defined by atomic structural coordinates; (b) determining a conformation of the
complex with a favorable geometric fit and favorable complementary interactions; and (c) identifying
the compounds that best fit the SAM active site as potential modulators. A computer representation
may be modified by deleting or adding a chemical group or groups. Computer representations of the
chemical groups can be selected from a computer database.

Another way of identifying potential modulators is to modify an existing modulator in the
polypeptide active-site. The computer representation of modulators can be modified within the
computer representation of a SAM domain active-site. This technique is described in detail in
Molecular Simulations User Manual, 1995 in LUDI. The computer representation of a modulator
may be modified by deleting a chemical group or groups, or by adding a chemical group or groups.
After each modification to a compound, the atoms of the modified compound and active-site can be
shifted in conformation and the distance between the modulator and the active site atoms may be
scored on the basis of geometric fit and favourable complementary interactions between the
molecules. Compounds with favourable scores are potential modulators.

Compounds designed by modulator building or modulator searching computer programs
may be screened to identify potential modulators. Examples of such computer programs include
programs in the Molecular Simulations Package (Catalyst), ISIS/HOST, ISIS/BASE, and
ISIS/DRAW (Molecular Designs Limited), and UNITY (Tripos Associates). A building program may
be used to replace computer representations of chemical groups in a compound complexed with a
SAM domain with groups from a computer data base. A searching program may be used to search
computer representations of compounds from a computer database that have similar three
dimensional structures and similar chemical groups as a compound that binds to a SAM domain. The
programs may be operated on the structure of the active-site of the three dimensional structure of an
Eph SAM domain, preferably an EphA4 SAM domain.
A typical program may comprise the following steps:

(a) mapping chemical features of the compound such as by hydrogen bond donors or
acceptors, hydrophobic/lipophilic sites, positively ionizable sites, or negatively
ionizable sites;

(b) adding geometric constraints to selected mapped features;

(c) searching data bases with the model generated in (b).

In an embodiment of the invention a method of identifying potential modulators of a SAM
domain, preferably an Eph SAM domain, more preferably an EphA SAM domain, is provided using
the three dimensional conformation of the SAM domain in various modulator construction or
modulator searching computer programs on compounds complexed with the SAM domain. The
method comprises the steps of (a) removing a computer representation of one or more compounds
complexed with a SAM domain; (b) (i) searching a data base for a compound with a similar
geometric structure or similar chemical groups to the removed compounds using a computer program
that searches computer representations of compounds from a database that have similar three
dimensional structures and similar chemical groups, or (ii) replacing portions of the compounds
complexed with the SAM domain with similar chemical structures (i.e. nearly identical shape and
volume) from a database using a compound construction computer program that replaces computer
representations of chemical groups with groups from a computer database, where the representations
of the compounds are defined by structural coordinates.
Potential modulators of SAM domains identified using the above-described methods may be
prepared using methods described in standard reference sources utilized by those skilled in the art.
For example, organic compounds may be prepared by organic synthetic methods described in
references such as March, 1994, Advanced Organic Chemistry: Reactions, Mechanisms, and
Structure, New York, McGraw Hill.

Cellular assays, as well as animal model assays in vivo, may be used to test the activity of a
potential modulator of a SAM domain as well as diagnose a disease associated with inappropriate
SAM domain activity. In vivo assays are also useful for testing the bioactivity of a potential
modulator designed by the methods of the invention.

The invention also relates to a potential modulator identified by the methods of the
invention.

Peptides

The invention provides peptide molecules that modulate SAM domain function. The
molecules are derived from the interface residues necessary for dimer formation. For example,
peptides of the invention include the amino acids Val 913, Val 914, Met 972, Met 976, Met 979, Val
944, and Leu 940 of the EphA4 SAM domain. Other proteins containing sequences corresponding to
the sequences necessary for dimer formation of a SAM domain may be identified with a protein
homology search, for example by searching available databases such as GenBank or SwissProt.
Various search algorithms and/or programs may be used, including FASTA, BLAST (available as a
part of the GCG sequence analysis package, University of Wisconsin, Madison, Wis.), or ENTREZ
(National Center for Biotechnology Information, National Library of Medicine, National Institutes of
Health, Bethesda, MD).

In accordance with an embodiment of the invention, specific peptides are contemplated that
mediate SAM domain function comprising VVSV (SEQ ID. NO. 21), SAVVSV (SEQ ID. NO. 22),
FSAVV (SEQ ID. NO. 23), FSAVVSV (SEQ ID. NO. 24), FSAVVSVGD (SEQ ID. NO. 25),






WO 00/37500 PCT/CA99/01209 


VVSVGDWL (SEQ ID. NO. 26), FNTV (SEQ ID. NO. 27), FNTVDE (SEQ ID. NO. 28),
FNTVDEWL (SEQ ID. NO. 29), TSFNTVDEWL (SEQ ID. NO. 30), TSFNTV (SEQ ID. NO. 31),
YTSFNTV (SEQ ID. NO. 32), RSEV (SEQ ID. NO. 33), RSEVLG (SEQ ID. NO. 34), RSEVLGWD
(SEQ ID. NO. 35), VPFRSEV (SEQ ID. NO. 36), and VPFRSEVLGW (SEQ ID. NO. 37).
In accordance with another embodiment of the invention, specific peptides are contemplated
that mediate SAM domain function. In particular, a peptide of the formula I is provided which
mediates SAM domain function:

X-X¹-X²-X³-X⁴-X⁵-X⁶    I

wherein X and X⁶ represent 0 to 70, preferably 0 to 50 amino acids, more preferably 2 to 20 amino
acids, and X¹ represents Leu, Phe, Asp, Ala, Glu, or Gly, preferably Leu or Gly, X² represents Glu,
Asp, Ser, Ile, Ala, Arg, Lys, and Gln, preferably Glu or Asp, X³ represents Ala, Val, Glu, Phe, Ser,
Ile, Met, Leu, His, Gln, Arg, or Asp, preferably Ala, Val, or Phe, X⁴ is Val, Leu, Met, Phe, and Ile,
preferably Val or Leu, or Phe, X⁵ is Val, Ser, Leu, Asp, Ala, Pro, Asn, Lys, or Cys, preferably Val or
Ser.
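
The position-by-position definition of formula I lends itself to a simple membership check on the five-residue core. The sketch below is illustrative only. The function name is an assumption, the flanking segments at either end of the formula are ignored, and the one-letter residue sets are transcribed from the definition of formula I as read from this text, so they should be verified against the published specification before any use.

```python
# Allowed residues at each core position of formula I, in one-letter code.
# Transcribed from the text; treat as an assumption pending verification.
FORMULA_I_CORE = [
    set("LFDAEG"),        # first core position: Leu, Phe, Asp, Ala, Glu, Gly
    set("EDSIARKQ"),      # second: Glu, Asp, Ser, Ile, Ala, Arg, Lys, Gln
    set("AVEFSIMLHQRD"),  # third: Ala, Val, Glu, Phe, Ser, Ile, Met, Leu, His, Gln, Arg, Asp
    set("VLMFI"),         # fourth: Val, Leu, Met, Phe, Ile
    set("VSLDAPNKC"),     # fifth: Val, Ser, Leu, Asp, Ala, Pro, Asn, Lys, Cys
]


def matches_formula_i_core(core):
    """Return True if the five-residue string satisfies every position of
    the formula I core as encoded in FORMULA_I_CORE."""
    if len(core) != len(FORMULA_I_CORE):
        return False
    return all(res in allowed for res, allowed in zip(core, FORMULA_I_CORE))
```

For instance, `matches_formula_i_core("LEFLS")` checks each of the five residues against the corresponding allowed set.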

In an embodiment of the present invention a peptide of the formula I is provided:
wherein X represents TT, ID, TS, DD, GYTT (SEQ ID. NO. 38), AAGYTT (SEQ ID. NO. 39),
FTAAGYTT (SEQ ID. NO. 40), DNFTAAGYTT (SEQ ID. NO. 41), or YKDNFTAAGYTT (SEQ
ID. NO. 42). In another embodiment X⁶ represents HM, HMSQ (SEQ ID. NO. 43), HMSQD (SEQ
ID. NO. 44), HMSQDD (SEQ ID. NO. 45), HMSQDDLA (SEQ ID. NO. 46), QMMM (SEQ ID. NO.
47), QMMMED (SEQ ID. NO. 48), QMMMEDLL (SEQ ID. NO. 49), DITE (SEQ ID. NO. 50),
DITEED (SEQ ID. NO. 51), DITEEDL (SEQ ID. NO. 52), NLTE (SEQ ID. NO. 53), NLTEND
(SEQ ID. NO. 54), NLTENDI (SEQ ID. NO. 55).
Preferred peptides of the formula I include the following: X-LEAVV-X⁶, X-FDVVS-X⁶,
X-LEFLS-X⁶, X-GARFL-X⁶, LEAVV (SEQ ID. NO. 56), TTLEAVV (SEQ ID. NO. 57), LEAVVHM
(SEQ ID. NO. 58), LEAVVHMSQ (SEQ ID. NO. 59), LEAVVHMSQD (SEQ ID. NO. 60),
LEAVVHMSQDDL (SEQ ID. NO. 61), LEAVVHMSQDDLAR (SEQ ID. NO. 62),
TTLEAVVHMS (SEQ ID. NO. 63), TTLEAVVHMSQD (SEQ ID. NO. 64), TTLEAVVHMSQDDL
(SEQ ID. NO. 65), TTLEAVVHMSQDDLAR (SEQ ID. NO. 66), GYTTLEAVV (SEQ ID. NO. 67),
GYTTLEAVVHMS (SEQ ID. NO. 68), GYTTLEAVVHMSQD (SEQ ID. NO. 69),
GYTTLEAVVHMSQDDL (SEQ ID. NO. 70), GYTTLEAVVHMSQDDLAR (SEQ ID. NO. 71),
FDVVS (SEQ ID. NO. 72), FDVVSQ (SEQ ID. NO. 73), FDVVSQMM (SEQ ID. NO. 74),
FDVVSQMMME (SEQ ID. NO. 75), FDVVSQMMMEDIL (SEQ ID. NO. 76), TSFDVVS (SEQ ID.
NO. 77), TSFDVVSQ (SEQ ID. NO. 78), TSFDVVSQMM (SEQ ID. NO. 79), TSFDVVSQMMME
(SEQ ID. NO. 80), TSFDVVSQMMMEDIL (SEQ ID. NO. 81), LEFLS (SEQ ID. NO. 82), LEFLSD
(SEQ ID. NO. 83), LEFLSDIT (SEQ ID. NO. 84), LEFLSDITEE (SEQ ID. NO. 85),
LEFLSDITEEDL (SEQ ID. NO. 86), DDLEFLS (SEQ ID. NO. 87), GWDDLEFLS (SEQ ID. NO.
88), DDLEFLSD (SEQ ID. NO. 89), DDLEFLSDIT (SEQ ID. NO. 90), DDLEFLSDITEE (SEQ ID.
NO. 91), DDLEFLSDITEEDL (SEQ ID. NO. 92), GARFL (SEQ ID. NO. 93), GARFLN (SEQ ID.
NO. 94), GARFLNLT (SEQ ID. NO. 95), GARFLNLTEN (SEQ ID. NO. 96), and IDGARFL (SEQ
ID. NO. 97).

In accordance with another embodiment of the invention, specific peptides are contemplated
that mediate SAM domain function. In particular, a peptide of the formula II is provided which
mediates SAM domain function:

X⁷-X⁸-X⁹-X¹⁰-X¹¹-X¹²-X¹³-X¹⁴-X¹⁵-X¹⁶    II

wherein X⁷ and X¹⁶ represent 0 to 70, preferably 0 to 50 amino acids, more preferably 2 to 20 amino
acids, and X⁸ represents Met, Ile, Ser, Leu, Asn, Phe, or Val, preferably Met, X⁹ represents Arg, Ser,
Lys, Met, Leu, Glu, Gln, or Asn, preferably Gln or Arg, X¹⁰ represents Thr, Ala, Arg, Leu, Ser, Glu,
Asp, Met, Lys, Gln, or Gly, preferably Thr, Ala, or Glu, X¹¹ represents Gln, Ser, Gln, Leu, Phe, Asp,
Thr, Arg, preferably Gln or Arg, X¹² represents Met, Ala, Ile, Asn, Ser, Arg, Thr, Pro, Leu, Gln, Val,
Lys, preferably Met or Arg, X¹³ represents Gln, Asn, Pro, Ser, Tyr, Glu, Leu, Arg, or Lys, preferably
Gln, Asn, or Arg, X¹⁴ represents Gln, Ala, Pro, Asp, Leu, Lys, Ile, Glu, Arg, or Asn, preferably Gln
or Ile, and X¹⁵ represents Met, Ile, Val, His, Ser, Arg, Lys, Phe, Cys, Glu, Tyr, Ala, Ile, Trp, or Leu.

In an embodiment of the present invention a peptide of the formula II is provided:
wherein X⁷ represents QA, QV, NK, SVQA (SEQ ID. NO. 98), LSSVQA (SEQ ID. NO. 99),
ILSSVQA (SEQ ID. NO. 100), NKILSSVQA (SEQ ID. NO. 101), HQNKILSSVQA (SEQ ID. NO.
102), THQNKILSSVQA (SEQ ID. NO. 103), ENIK (SEQ ID. NO. 104), SQEINK (SEQ ID. NO.
105), KLSQEINK (SEQ ID. NO. 106), ILNSIQV (SEQ ID. NO. 107), or NSIQV (SEQ ID. NO.
108). In another embodiment X¹⁶ is HG, QS, HGRM (SEQ ID. NO. 109), HGRMVP (SEQ ID. NO.
110), QSVEV (SEQ ID. NO. 111), or TRKP (SEQ ID. NO. 112).

Preferred peptides of the formula II include the following: X⁷-MRTQMQQM-X¹⁶,
X⁷-MRAQMNQI-X¹⁶, X⁷-NEERRSIF-X¹⁶, MRTQMQQM (SEQ ID. NO. 113), QAMRTQMQQM
(SEQ ID. NO. 114), SVQAMRTQMQQM (SEQ ID. NO. 115), LSSVQAMRTQMQQM (SEQ ID.
NO. 116), ILSSVQAMRTQMQQM (SEQ ID. NO. 117), MRTQMQQMHG (SEQ ID. NO. 118),
MRTQMQQMHGRM (SEQ ID. NO. 119), MRTQMQQMHGRMVPV (SEQ ID. NO. 120),
NEERRSIF (SEQ ID. NO. 121), INKNEERRSIF (SEQ ID. NO. 122), NEERRSIFTRKP (SEQ ID.
NO. 123), MRAQMNQI (SEQ ID. NO. 124), MRAQMNQIQS (SEQ ID. NO. 125),
MRAQMNQIQSVEV (SEQ ID. NO. 126).

All of the peptides of the invention, as well as molecules substantially homologous,
complementary or otherwise functionally or structurally equivalent to these peptides may be used for
purposes of the present invention. In addition to full-length peptides of the invention, truncations of
the peptides are contemplated in the present invention. Truncated peptides may comprise peptides of
about 7 to 10 amino acid residues.
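
As an illustration of what truncations of about 7 to 10 residues can mean for a given peptide, the following sketch enumerates every contiguous sub-sequence in that length range. The function name and the strict treatment of the bounds are assumptions made for the example. The text says "about" and does not prescribe how truncations are generated.

```python
def truncations(peptide, min_len=7, max_len=10):
    """Return all contiguous sub-sequences of `peptide` whose length lies
    between min_len and max_len residues inclusive, shortest first."""
    result = []
    longest = min(max_len, len(peptide))
    for length in range(min_len, longest + 1):
        for start in range(len(peptide) - length + 1):
            result.append(peptide[start:start + length])
    return result
```

Applied to a ten-residue peptide such as TTLEAVVHMS (SEQ ID. NO. 63), the function returns the four 7-mers, three 8-mers, two 9-mers and the full-length sequence. A peptide shorter than seven residues yields an empty list.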

The truncated peptides may have an amino group (-NH2), a hydrophobic group (for
example, carbobenzoxyl, dansyl, or T-butyloxycarbonyl), an acetyl group, a 9-fluorenylmethoxy-carbonyl
(FMOC) group, or a macromolecule including but not limited to lipid-fatty acid conjugates,
polyethylene glycol or carbohydrates at the amino terminal end. The truncated peptides may have a
carboxyl group, an amido group, a T-butyloxycarbonyl group, or a macromolecule including but not
limited to lipid-fatty acid conjugates, polyethylene glycol, or carbohydrates at the carboxy terminal
end.

The peptides of the invention may also include analogs of a peptide of the invention and/or
truncations of the peptide, which may include, but are not limited to a peptide of the invention
containing one or more amino acid insertions, additions, or deletions, or both. Analogs of the peptide
of the invention exhibit the activity characteristic of the peptide, e.g. interference with SAM domain
dimer formation, and may further possess additional advantageous features such as increased
bioavailability, stability, or reduced host immune recognition. One or more amino acid insertions may
be introduced into a peptide of the invention. Amino acid insertions may consist of a single amino
acid residue or sequential amino acids.

One or more amino acids, preferably one to five amino acids, may be added to the right or
left termini of a peptide of the invention. Deletions may consist of the removal of one or more amino
acids, or discrete portions from the peptide sequence. The deleted amino acids may or may not be
contiguous. The lower limit length of the resulting analog with a deletion mutation is about 7 amino
acids.

The invention also includes a peptide conjugated with a selected protein, or a selectable
marker (see below) to produce fusion proteins.

The peptides of the invention may be prepared using recombinant DNA methods.
Accordingly, nucleic acid molecules which encode a peptide of the invention may be incorporated in
a known manner into an appropriate expression vector which ensures good expression of the peptide.
Possible expression vectors include but are not limited to cosmids, plasmids, or modified viruses so
long as the vector is compatible with the host cell used. The expression vectors contain a nucleic acid
molecule encoding a peptide of the invention and the necessary regulatory sequences for the
transcription and translation of the inserted protein-sequence. Suitable regulatory sequences may be
obtained from a variety of sources, including bacterial, fungal, viral, mammalian, or insect genes.
(For example, see the regulatory sequences described in Goeddel, Gene Expression Technology:
Methods in Enzymology 185, Academic Press, San Diego, CA (1990)). Selection of appropriate
regulatory sequences is dependent on the host cell chosen, and may be readily accomplished by one
of ordinary skill in the art. Other sequences, such as an origin of replication, additional DNA
restriction sites, enhancers, and sequences conferring inducibility of transcription may also be
incorporated into the expression vector.

The recombinant expression vectors may also contain a selectable marker gene which
facilitates the selection of transformed or transfected host cells. Suitable selectable marker genes are
genes encoding proteins that confer resistance to certain drugs such as G418 and hygromycin,
β-galactosidase, chloramphenicol acetyltransferase, firefly luciferase, or an immunoglobulin or portion
thereof such as the Fc portion of an immunoglobulin, preferably IgG. The selectable markers may be
introduced on a separate vector from the nucleic acid of interest.

The recombinant expression vectors may also contain genes that encode a fusion portion
which provides increased expression of the recombinant peptide; increased solubility of the
recombinant peptide; and/or aid in the purification of the recombinant peptide by acting as a ligand in
affinity purification. For example, a proteolytic cleavage site may be inserted in the recombinant
peptide to allow separation of the recombinant peptide from the fusion portion after purification of
the fusion protein. Examples of fusion expression vectors include pGEX (Amrad Corp., Melbourne,
Australia), pMAL (New England Biolabs, Beverly, MA) and pRIT5 (Pharmacia, Piscataway, NJ)
which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to
the recombinant protein.

Recombinant expression vectors may be introduced into host cells to produce a transformant
host cell. Transformant host cells include prokaryotic and eukaryotic cells which have been
transformed or transfected with a recombinant expression vector of the invention. The terms
"transformed with", "transfected with", "transformation" and "transfection" are intended to include
the introduction of nucleic acid (e.g. a vector) into a cell by one of many techniques known in the art.
For example, prokaryotic cells can be transformed with nucleic acid by electroporation or
calcium-chloride mediated transformation. Nucleic acid can be introduced into mammalian cells using
conventional techniques such as calcium phosphate or calcium chloride co-precipitation,
DEAE-dextran-mediated transfection, lipofectin, electroporation or microinjection. Suitable methods for
transforming and transfecting host cells may be found in Sambrook et al. (Molecular Cloning: A
Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press (1989)), and other laboratory
textbooks.

Suitable host cells include a wide variety of prokaryotic and eukaryotic host cells. For
example, the peptides of the invention may be expressed in bacterial cells such as E. coli, insect cells
(using baculovirus), yeast cells or mammalian cells. Other suitable host cells can be found in
Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego,
CA (1991).

The peptides of the invention may be tyrosine phosphorylated using the method described in
Reedijk et al. (The EMBO Journal 11(4): 1365, 1992). For example, tyrosine phosphorylation may be
induced by infecting bacteria harbouring a plasmid containing a nucleotide sequence encoding a
peptide of the invention, with a λgt11 bacteriophage encoding the cytoplasmic domain of the Elk
tyrosine kinase as a LacZ-Elk fusion. Bacteria containing the plasmid and bacteriophage as a lysogen
are isolated. Following induction of the lysogen, the expressed peptide becomes phosphorylated by
the Elk tyrosine kinase.

The peptides of the invention may be synthesized by conventional techniques. For example,
the peptides may be prepared by chemical synthesis. These methods employ either solid or
solution phase synthesis (see for example, J. M. Stewart and J. D. Young, Solid Phase Peptide
Synthesis, 2nd Ed., Pierce Chemical Co., Rockford Ill. (1984) and G. Barany and R. B. Merrifield,
The Peptides: Analysis, Synthesis, Biology, editors E. Gross and J. Meienhofer, Vol. 2, Academic
Press, New York, 1980, pp. 3-254 for solid phase synthesis techniques; and M. Bodansky,
Principles of Peptide Synthesis, Springer-Verlag, Berlin 1984, and E. Gross and J. Meienhofer,
Eds., The Peptides: Analysis, Synthesis, Biology, supra, Vol. 1, for classical solution
synthesis). By way of example, the peptides may be synthesized using 9-fluorenyl methoxycarbonyl
(Fmoc) solid phase chemistry with direct incorporation of phosphotyrosine as the
N-fluorenylmethoxy-carbonyl-O-dimethyl phosphono-L-tyrosine derivative.

N-terminal or C-terminal fusion proteins comprising a peptide of the invention conjugated
with other molecules may be prepared by fusing, through recombinant techniques, the N-terminal or
C-terminal of the peptide, and the sequence of a selected protein or selectable marker with a desired
biological function. The resultant fusion proteins contain the peptide fused to the selected protein or
marker protein as described herein. Examples of proteins which may be used to prepare fusion
proteins include immunoglobulins, glutathione-S-transferase (GST), hemagglutinin (HA), and
truncated myc.

Cyclic derivatives of the peptides of the invention are also part of the present invention.
Cyclization may allow the peptide to assume a more favorable conformation for association with
molecules in complexes of the invention. Cyclization may be achieved using techniques known in the
art. For example, disulfide bonds may be formed between two appropriately spaced components
having free sulfhydryl groups, or an amide bond may be formed between an amino group of one
component and a carboxyl group of another component. Cyclization may also be achieved using an
azobenzene-containing amino acid as described by Ulysse, L., et al., J. Am. Chem. Soc. 1995, 117,
8466-8467. The side chains of Tyr and Asn may be linked to form cyclic peptides. The components
that form the bonds may be side chains of amino acids, non-amino acid components or a combination
of the two. In an embodiment of the invention, cyclic peptides are contemplated that have a beta-turn
in the right position. Beta-turns may be introduced into the peptides of the invention by adding the
amino acids Pro-Gly at the right position.

It may be desirable to produce a cyclic peptide that is more flexible than the cyclic peptides
containing peptide bond linkages as described above. A more flexible peptide may be prepared by
introducing cysteines at the right and left position of the peptide and forming a disulphide bridge
between the two cysteines. The two cysteines are arranged so as not to deform the beta-sheet and
turn. The peptide is more flexible as a result of the length of the disulfide linkage and the smaller
number of hydrogen bonds in the beta-sheet portion. The relative flexibility of a cyclic peptide can be
determined by molecular dynamics simulations. Peptide mimetics may be designed based on
information obtained by systematic replacement of L-amino acids by D-amino acids, replacement of
side chains with groups having different electronic properties, and by systematic replacement of
peptide bonds with amide bond replacements. Local conformational constraints can also be
introduced to determine conformational requirements for activity of a candidate peptide mimetic. The
mimetics may include isosteric amide bonds, or D-amino acids to stabilize or promote reverse turn
conformations and to help stabilize the molecule. Cyclic amino acid analogues may be used to
constrain amino acid residues to particular conformational states. The mimetics can also include
mimics of inhibitor peptide secondary structures. These structures can model the 3-dimensional
orientation of amino acid residues into the known secondary conformations of the proteins. Peptoids
may also be used which are oligomers of N-substituted amino acids and can be used as motifs for the
generation of chemically diverse libraries of novel molecules.

Peptides of the invention may be developed using a biological expression system. The use of
these systems allows the production of large libraries of random peptide sequences and the screening
of these libraries for peptide sequences that interact with particular amino acid residues. Libraries
may be produced by cloning synthetic DNA that encodes random peptide sequences into appropriate
expression vectors (see Christian et al 1992, J. Mol. Biol. 227:711; Devlin et al, 1990 Science
249:404; Cwirla et al 1990, Proc. Natl. Acad. Sci. USA, 87:6378). Libraries may also be constructed
by concurrent synthesis of overlapping peptides (see U.S. Pat. No. 4,708,871).
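
As a rough illustration of the two library strategies mentioned above, the following Python sketch enumerates overlapping peptides from a parent sequence and counts how many distinct random sequences of a given length exist. The function names, the window length, the step and the example sequence are all hypothetical choices made for illustration; none of them comes from the cited references.

```python
from typing import List


def overlapping_peptides(sequence: str, length: int, step: int) -> List[str]:
    """Return every window of `length` residues, advancing `step` residues each time."""
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    return [sequence[i:i + length]
            for i in range(0, len(sequence) - length + 1, step)]


def random_library_size(peptide_length: int, alphabet_size: int = 20) -> int:
    """Number of distinct sequences of a given length over the 20 standard amino acids."""
    return alphabet_size ** peptide_length


if __name__ == "__main__":
    # Hypothetical 9-residue parent sequence, used for illustration only.
    print(overlapping_peptides("ACDEFGHIK", length=5, step=2))
    print(random_library_size(6))
```

The count grows as 20 to the power of the peptide length, which is one reason random libraries are screened rather than synthesized one sequence at a time.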

Peptides of the invention may be used to identify lead compounds for drug development.
The structure of the peptides described herein can be readily determined by a number of methods
such as NMR and X-ray crystallography. A comparison of the structures of peptides similar in
sequence, but differing in the biological activities they elicit in target molecules can provide
information about the structure-activity relationship of the target. Information obtained from the
examination of structure-activity relationships can be used to design either modified peptides, or
other small molecules or lead compounds which can be tested for predicted properties as related to
the target molecule. The activity of the lead compounds can be evaluated using assays similar to
those described herein.

Information about structure-activity relationships may also be obtained from
co-crystallization studies. In these studies, a peptide with a desired activity is crystallized in association
with a target molecule, i.e. a SAM domain, and the X-ray structure of the complex is determined. The
structure can then be compared to the structure of the target molecule in its native state, and
information from such a comparison may be used to design compounds expected to possess desired
activities.

The peptides of the invention may be converted into pharmaceutical salts by reacting with
inorganic acids such as hydrochloric acid, sulfuric acid, hydrobromic acid, phosphoric acid, etc., or
organic acids such as formic acid, acetic acid, propionic acid, glycolic acid, lactic acid, pyruvic acid,
oxalic acid, succinic acid, malic acid, tartaric acid, citric acid, benzoic acid, salicylic acid,
benzenesulfonic acid, and toluenesulfonic acids. The peptides of the invention may be used to
prepare antibodies. Conventional methods can be used to prepare the antibodies.

The peptides and antibodies specific for the peptides of the invention may be labelled using
conventional methods with various enzymes, fluorescent materials, luminescent materials and
radioactive materials. Suitable enzymes, fluorescent materials, luminescent materials, and radioactive
materials are well known to the skilled artisan. Antibodies and labeled antibodies specific for the
peptides of the invention may be used to screen for proteins containing SAM domains.

Computer modelling techniques known in the art may also be used to observe the interaction
of a peptide of the invention, and truncations and analogs thereof with a SAM domain (for example,
Homology Insight II and Discovery available from BioSym/Molecular Simulations, San Diego,
California, U.S.A.). If computer modelling indicates a strong interaction, the peptide can be
synthesized and tested for its ability to interfere with SAM domain dimer formation.

Compositions and Methods of Treatment

A purified three dimensional SAM domain structure of the invention, the peptides of the
invention, and the modulators identified using the methods of the invention may be used to modify
the inappropriate activity of a SAM domain involved in a clinical disorder. They may be used in the
treatment and diagnosis of disorders associated with aberrant T cell signaling and to modulate
telomere function. In particular, they may be useful in methods for therapy of cellular senescence and
immortalization controlled by telomere length and telomerase activity, and as selective
immunosuppressants (e.g. in organ transplantation). They may also be useful in the treatment of
cancers, such as melanoma, ocular melanoma, leukemia, astrocytoma, glioblastoma, lymphoma,
glioma, Hodgkin's lymphoma, multiple myeloma, sarcoma, myosarcoma, cholangiocarcinoma,
squamous cell carcinoma, CLL, and cancers of the pancreas, breast, brain, prostate, bladder, thyroid,
ovary, uterus, testis, kidney, stomach, colon and rectum, particularly leukemia including B-cell
leukemia, T-cell leukemia, null-cell leukemia, myelogenous leukemia, and lymphocytic leukemia.

Further, the three dimensional SAM domain structure of the invention, the peptides of the
invention, and the modulators identified using the methods of the invention may be used to modulate
the biological activity of an Eph receptor or Eph ligand in a cell, including inhibiting or enhancing
signal transduction activities of the receptor or ligand, and in particular modulating a pathway in a
cell regulated by the ligand or receptor, particularly those pathways involved in neuronal
development, axonal migration, path finding and regeneration. The three dimensional SAM domain
structure of the invention, the peptides of the invention, and modulators identified using the methods
of the invention will be useful as pharmaceuticals to modulate axonogenesis, nerve cell interactions
and regeneration, to treat conditions such as neurodegenerative diseases and conditions involving
trauma and injury to the nervous system, for example Alzheimer's disease, Parkinson's disease,
Huntington's disease, demyelinating diseases, such as multiple sclerosis, amyotrophic lateral
sclerosis, bacterial and viral infections of the nervous system, deficiency diseases, such as Wernicke's
disease and nutritional polyneuropathy, progressive supranuclear palsy, Shy Drager's syndrome,
multisystem degeneration and olivopontocerebellar atrophy, peripheral nerve damage, and trauma and
ischemia resulting from stroke.

The present invention thus provides a method for treating cancer (e.g. leukemia), and
disorders associated with T cell signaling, modulating telomere function, or affecting neuronal
development or regeneration, in a subject comprising administering to a subject an effective amount
of a three dimensional SAM domain structure of the invention, a peptide of the invention, or a
modulator identified using the methods of the invention. The invention also contemplates a method
for stimulating or inhibiting axonogenesis in a subject comprising administering to a subject an
effective amount of a three dimensional SAM domain structure of the invention, a peptide of the
invention, or a modulator identified using the methods of the invention.

The invention still further relates to a pharmaceutical composition which comprises a
purified three dimensional SAM domain structure of the invention, a peptide of the invention, or a
modulator identified using the methods of the invention, and a pharmaceutically acceptable carrier,
diluent or excipient. The pharmaceutical compositions may be used to stimulate or inhibit neuronal
development, regeneration and axonal migration associated with neurodegenerative conditions, and
conditions involving trauma and injury to the nervous system. They may also be used to treat cancer
and disorders associated with T cell signaling, and modulate telomere function.

The compositions of the invention are administered to subjects in a biologically compatible
form suitable for pharmaceutical administration in vivo. By "biologically compatible form suitable
for administration in vivo" is meant a form of the protein to be administered in which any toxic
effects are outweighed by the therapeutic effects of the protein. The term subject is intended to
include mammals and includes humans, dogs, cats, mice, rats, and transgenic species thereof.
Administration of a therapeutically active amount of the pharmaceutical compositions of the present
invention is defined as an amount effective, at dosages and for periods of time necessary to achieve
the desired result. For example, a therapeutically active amount of a three dimensional SAM domain
structure of the invention, peptides of the invention, or modulators of the invention may vary
according to factors such as the condition, age, sex, and weight of the individual. Dosage regimes
may be adjusted to provide the optimum therapeutic response. For example, several divided doses
may be administered daily or the dose may be proportionally reduced as indicated by the exigencies
of the therapeutic situation.

The active compound may be administered in a convenient manner such as by injection
(subcutaneous, intravenous, etc.), oral administration, inhalation, transdermal application, or
intracerebral administration. In particular embodiments, pharmaceutical compositions of the
invention are administered directly to the peripheral or central nervous system, for example by
administration intracerebrally.

A pharmaceutical composition of the invention can be administered to a subject in an
appropriate carrier or diluent, co-administered with enzyme inhibitors or in an appropriate carrier
such as microporous or solid beads or liposomes. The term "pharmaceutically acceptable carrier" as
used herein is intended to include diluents such as saline and aqueous buffer solutions. Liposomes
include water-in-oil-in-water emulsions as well as conventional liposomes (Strejan et al., (1984) J.
Neuroimmunol 7:27). The active compound may also be administered parenterally or
intraperitoneally. Dispersions can also be prepared in glycerol, liquid polyethylene glycols, and
mixtures thereof and in oils. Under ordinary conditions of storage and use, these preparations may
contain a preservative to prevent the growth of microorganisms. Depending on the route of
administration, the active compound may be coated to protect the compound from the action of
enzymes, acids and other natural conditions which may inactivate the compound.

The pharmaceutical compositions may be administered locally to stimulate axonogenesis
and pathfinding; for example, the compositions may be administered in areas of local nerve injury or
in areas where normal nerve pathway development has not occurred. The pharmaceutical
compositions may also be placed in a specific orientation or alignment along a presumptive pathway
to stimulate axon pathfinding along that line; for example, the pharmaceutical compositions may be
incorporated on microcarriers laid down along the pathway. In particular, the pharmaceutical
compositions of the invention may be used to stimulate formation of connections between areas of
the brain, such as between the two hemispheres or between the thalamus and ventral midbrain. The
pharmaceutical compositions may be used to stimulate formation of the medial tract of the anterior
commissure or the habenular interpeduncle.

Therapeutic administration of polypeptides may also be accomplished using gene therapy. A
nucleic acid including a promoter operatively linked to a heterologous polypeptide may be used to
produce high-level expression of the polypeptide in cells transfected with the nucleic acid. DNA or
isolated nucleic acids may be introduced into cells of a subject by conventional nucleic acid delivery
systems. Suitable delivery systems include liposomes, naked DNA, and receptor-mediated delivery
systems, and viral vectors such as retroviruses, herpes viruses, and adenoviruses.

The following non-limiting example is illustrative of the present invention: 
EXAMPLE 

The following methods were used to determine the crystal structure of the SAM domain of
the Eph receptor isoform A4.

Protein expression, mutagenesis and purification: The SAM domain of the Eph receptor isoform
A4 (residues 890 to 981) was expressed in E. coli as a GST fusion protein using the pGEX-2T vector
(Pharmacia). The QuikChange kit (Stratagene) was used to generate site directed mutants for
dimerization analysis and for heavy atom phasing. Protein was purified by affinity chromatography
using glutathione Sepharose beads (Pharmacia). Bound protein was eluted by cleavage with
thrombin. After concentrating to 10 mM, protein was applied to a Superdex 75 gel filtration column
(Pharmacia) for final purification and characterization.
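
To relate a molar concentration such as the 10 mM quoted above to a mass concentration, a minimal Python sketch is given below. The molecular weight used in the example (about 10 kDa) is an assumption for illustration only: it is a round figure for a construct of roughly 92 residues at about 110 Da per residue, not a value stated in this document.

```python
def mg_per_ml(millimolar: float, molecular_weight_da: float) -> float:
    """Convert mM to mg/ml: (mmol/L) * (g/mol) gives mg/L, then divide by 1000."""
    return millimolar * molecular_weight_da / 1000.0


def millimolar(mg_ml: float, molecular_weight_da: float) -> float:
    """Convert mg/ml to mM (inverse of mg_per_ml)."""
    return mg_ml * 1000.0 / molecular_weight_da


if __name__ == "__main__":
    ASSUMED_MW_DA = 10_000.0  # hypothetical round value, not taken from the text
    print(mg_per_ml(10.0, ASSUMED_MW_DA))
    print(millimolar(100.0, ASSUMED_MW_DA))
```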

Crystallization and data collection: Hanging drops containing 1 µl of 100 mg/ml native or mutant
(Glu 941 Cys) protein in 7 mM Hepes pH 7.5 were mixed with equal volumes of reservoir buffer
containing 100 mM cacodylate pH 6.5, 7% (w/v) PEG 8000, and 20% (v/v) ethylene glycol. Rod-like
crystals of approximate dimensions 0.05 x 0.05 x 0.2 mm were obtained overnight. The crystals
contain one molecule of the EphA4 SAM domain per asymmetric unit, and belong to the space group
P6₄ (a = b = 77.14 Å, c = 24.37 Å). The solution dimer corresponds to a crystallographic dimer
generated from the asymmetric unit by a twofold rotation parallel to the unique crystal axis. Crystals
were cryo-protected in reservoir buffer enriched to 20% (w/v) PEG 8000 and 20% (v/v) ethylene
glycol prior to stream freezing. Heavy atom derivatives were prepared by soaking crystals overnight
in 1-10 mM heavy atom solution prepared in cryo-protection buffer.

Native and derivative diffraction data were collected on frozen crystals (108 K) using a
Mar 345 imaging plate detector system with an RU200 rotating anode generator (Table 1). Data
processing and reduction was carried out with the HKL, DENZO, and SCALEPACK programs.
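
The cell dimensions quoted above can be turned into a unit cell volume and a rough packing estimate. The sketch below assumes a hexagonal cell (a = b, gamma = 120 degrees), six asymmetric units per cell as expected for a P6₄ space group, and a hypothetical molecular weight of about 10.1 kDa for the construct; the last two values are assumptions made for illustration and are not stated in the text.

```python
import math


def hexagonal_cell_volume(a: float, c: float) -> float:
    """Volume of a hexagonal unit cell in cubic angstroms: V = a^2 * c * sin(120 deg)."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(cell_volume: float,
                         molecules_per_cell: int,
                         molecular_weight_da: float) -> float:
    """Crystal volume per unit of protein mass, in cubic angstroms per dalton."""
    return cell_volume / (molecules_per_cell * molecular_weight_da)


if __name__ == "__main__":
    volume = hexagonal_cell_volume(77.14, 24.37)  # cell edges from the text
    ASSUMED_Z = 6              # assumption: six asymmetric units per cell
    ASSUMED_MW_DA = 10_120.0   # assumption: about 92 residues at 110 Da each
    print(round(volume, 1))
    print(round(matthews_coefficient(volume, ASSUMED_Z, ASSUMED_MW_DA), 2))
```

Under these assumptions the packing estimate falls within the range commonly seen for protein crystals, which is consistent with one molecule per asymmetric unit.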

Single isomorphous replacement (SIR) protein phases were calculated using lead derivative
data collected on two separate protein crystals. The heavy atom site was identified by the Patterson
search program HASSP [Terwilliger, 1987]. A Glu 941 to Cys site directed mutant of the EphA4
SAM domain construct was employed for mercury derivatization. The heavy atom position of the
mercury derivative data, which was collected on three separate crystals, was identified by difference
Fourier synthesis. Multiple isomorphous replacement and anomalous scattering (MIRAS) phases,
using only the lead derivative anomalous signal, were calculated and iterative rounds of automatic
solvent boundary determination/density modification were performed using the PHASES package
[Furey, 1990]. The resultant experimental electron density map allowed for the complete tracing of
the SAM domain backbone structure.

Model building and Refinement: Model building was performed using O [Jones, 1991]. A starting
model comprising approximately 65% of the total structure was refined using XPLOR [Brunger,
1992]. Bulk solvent correction was applied during refinement and simulated annealing protocols were
employed. The remaining structure was built into 2Fo-Fc electron density maps generated with
XPLOR. The final refinement statistics are shown in Table 1. The first 20 residues of the SAM
domain construct are disordered (residues 890 to 909) and have not been modeled. No amino acid
residues occupy disallowed regions of the Ramachandran plot and 94% occupy the most favored
regions.
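
The Ramachandran statistic quoted above is a statement about backbone dihedral angles: phi is defined by the atoms C(i-1), N, CA, C and psi by N, CA, C, N(i+1). The sketch below shows one common way to compute a signed dihedral angle from four points. It is a generic geometric routine written for illustration, not the procedure used by the refinement program, and the test points are hypothetical.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0: Vec, p1: Vec, p2: Vec, p3: Vec) -> float:
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    if length == 0.0:
        raise ValueError("central bond has zero length")
    axis = (b1[0] / length, b1[1] / length, b1[2] / length)
    # Keep only the parts of the outer bonds perpendicular to the central bond.
    k0 = _dot(b0, axis)
    k2 = _dot(b2, axis)
    v = _sub(b0, (k0 * axis[0], k0 * axis[1], k0 * axis[2]))
    w = _sub(b2, (k2 * axis[0], k2 * axis[1], k2 * axis[2]))
    x = _dot(v, w)
    y = _dot(_cross(axis, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Hypothetical points: the outer atoms sit at right angles about the z axis.
    print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```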

Results:

The X-ray crystal structure of the SAM domain from the EphA4 receptor tyrosine kinase
(Table 1 and 2) was determined. The boundaries of the structure were defined by limited proteolysis
and mass-spectrometry. Overall, the structure of the homodimer is oblong and arises from the
association of two 'lobster claw' shaped subunits. Each subunit possesses a globular fold consisting
of an N-terminal extended strand segment, followed by four short α helices (α1 to α4) and one long
C-terminal helix α5 (Figure 2A, 2B, and 2C). The N- and the C-termini are located on one side of
the subunit fold, similar to other protein interaction modules with signaling function (SH3, SH2, PH
domains etc.) [Kuriyan, 1997]. However, in contrast to these other domains, the termini compose the
functional end of the molecule rather than lying opposite to the ligand-binding surface. As shown in
Figure 3A and 3B, the N-terminal strand region and the C-terminal helix α5 extend from the subunit
core and interdigitate in a pincer like manner with the termini of a second subunit, to form an
elaborate dimer interface. In addition to the N- and C-terminal regions, α-helices α1 and α3
contribute side chains to the dimer interface.

The N-terminal strands cross in an anti-parallel manner and project the side chains of Ala
912, Val 913, Val 914 and Phe 910 downward to form one mandible of the 'lobster claw' shaped
subunit. The C-terminal helices α5 also cross in an anti-parallel manner with each α-helix projecting
the side chains of Met 972, Met 976, and Met 979 upwards to form the second mandible. Together
these side chains compose a hydrophobic core that is fully continuous with those of the individual
subunits. Residues bridging the subunit and interface cores include Trp 919, Ala 922 and Ile 923 from
helix α1 and Leu 940 and Val 944 from helix α3. Complementing these hydrophobic interactions,
the conserved side chain of arginine 973 forms intermolecular electrostatic interactions with the free
carboxylate of glycine 981 and a stabilizing charge/helix dipole interaction with the C-terminus of
helix α5 (Figure 2C). Additional polar residues located at or in close proximity to the dimer interface
include His 980, Gln 975, His 945, Gln 977, Glu 941 and Ser 911.

In order to identify determinants of dimerization and to test that the crystallographic dimer
model reflects the solution structure of the EphA4 SAM domain, SAM domain residues, either singly
or in combination, were substituted and the behaviour of these mutants was tested using size
exclusion chromatography (Figure 4). In agreement with predictions from the crystal structure,
mutations involving the interface residues Val 913, Val 914, Met 972, Met 976, Met 979, Val 944,
and Leu 940 abolished dimer formation. In contrast, mutation of Val 969 to Ala, which comprises
part of the second hydrophobic surface region (Figure 3A and 3B), did not affect dimerization while
mutation of the proximal residue Ile 959 to Lys appeared to disrupt the integrity of the subunit fold.
Additionally, mutation of the surface exposed residues Glu 941, Asp 949, and Ser 968 to cysteine
did not disrupt SAM domain dimerization. In summary, the mutagenesis results are consistent with
and support the notion that the SAM domain dimer observed in the crystal structure represents a
mechanism through which the SAM domain associates in solution.
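
For bookkeeping, the substitutions whose replacement residue is stated explicitly in the paragraph above can be recorded in the usual one-letter shorthand. The Python sketch below is an illustrative data structure only; the outcome labels paraphrase the text, and the helper name `short_name` is hypothetical.

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def short_name(wild_type: str, position: int, substitution: str) -> str:
    """Format a substitution such as ('Val', 969, 'Ala') as 'V969A'."""
    return f"{THREE_TO_ONE[wild_type]}{position}{THREE_TO_ONE[substitution]}"


# Only substitutions whose replacement residue is given in the text are listed.
REPORTED = [
    ("Val", 969, "Ala", "dimerization not affected"),
    ("Ile", 959, "Lys", "subunit fold apparently disrupted"),
    ("Glu", 941, "Cys", "dimerization not disrupted"),
    ("Asp", 949, "Cys", "dimerization not disrupted"),
    ("Ser", 968, "Cys", "dimerization not disrupted"),
]

if __name__ == "__main__":
    for wild_type, position, substitution, outcome in REPORTED:
        print(short_name(wild_type, position, substitution), "-", outcome)
```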

To investigate whether the dimer model for the Eph receptor SAM domain has more general
relevance for SAM domain containing proteins, the predicted locations of residues that are required
for the dimerization of SAM domains on other polypeptides were examined. When mutations that
map to conserved features of the subunit core and therefore are likely to disrupt the subunit fold are
eliminated, a number of informative mutations stand out. For example, the homo- and hetero-typic
dimerization of the Polycomb family of transcriptional repressors ph, RAE28 and Scm, is abolished
by mutation of two residues predicted to map to the dimer interface [Kyba, 1998]. These residues, Ile
62 and Trp 1 of the ph SAM domain, correspond to the N-terminal strand residue Phe 910 and the α5
helix residue Met 972, respectively, of the EphA4 SAM domain. Both residues are highly conserved
amongst the SAM domains and yet are unlikely to affect the individual subunit fold. The mutation of
the latter residue (Met 972 to Lys) in the EphA4 SAM domain yields a compact monomer structure
(Figure 4). In addition, the hetero-dimerization of the SAM domain containing proteins Byr2p and
Ste4p is disrupted by the substitution of Arg 69 with cysteine [Tu, 1997]. This mutation maps to
the interface residue Gln 977 of the EphA4 SAM dimer, and is located at the crossing site of the two
α5 helices. Taken together, these observations indicate that the dimer structure of the EphA4 SAM
domain may reflect a more general mode of SAM domain dimerization.

The crystallographic model for SAM domain dimerization is attractive for a number of
reasons. Firstly, in the case of the Eph receptors, the linker between the SAM and the catalytic
domains is short (5 residues of poorly conserved sequence) so that the N-termini of the dimer would
have to be oriented in the same direction and in close proximity if the kinase domains of clustered
receptors were to be juxtaposed. The structure shows this to be the case. Secondly, the mechanism
of dimerization revealed by the structure could account for the observation that the SAM domain is
found at either terminus of signaling proteins. Because the N- and C-terminal ends of the SAM
domain compose the dimer interface, the insertion of a SAM domain at an internal site in a
polypeptide chain would sterically restrict access to a second SAM domain, especially if the host
sequence was itself structured. The solutions to this dilemma would be to place a SAM domain at the
end of a protein (as is usually observed), or to surround it with long linker sequences. In this regard
the SAM domain differs from modules such as SH2 and SH3 domains, which can readily be located
at internal positions in a polypeptide chain since the ligand-binding site is located opposite to the
location of the N- and C-termini [Kuriyan, 1997]. Thirdly, in the case of the liprins we have noted
three adjacent SAM domains in a region previously shown to mediate liprin hetero-dimerization
[Serra-Pages, 1998]. Because the C-termini of the dimerized SAM domain are in close proximity, on
the opposite side from the N-termini, a configuration of stacked SAM domains can be readily
envisioned.

SAM dimerization may contribute to receptor oligomerization and activation by bringing
catalytic elements into proximity for autophosphorylation. The SAM domain may have a direct
inhibitory interaction with the kinase domain that can be competed away by dimerization.
Alternatively SAM domain mediated dimerization might maintain opposing catalytic domains in a
mutually inaccessible, and thus repressed state. The Eph SAM domains might also recruit signaling
partners through heteromeric SAM-SAM interactions, or through specific recognition of cytoplasmic
proteins by the Eph SAM dimer.

SAM dimerization might be constitutive, but controlled through co-operative or antagonistic
interactions with other clustering forces. Dimerization could potentially be controlled by
modifications such as tyrosine phosphorylation, and indeed a residue within the SAM domain of the
EphB1 receptor can become tyrosine phosphorylated in vivo [Stein, 1996]. Finally, the five residues
that lie C-terminal to the Eph SAM domain represent a potential binding site for PDZ domain
proteins [Hock, 1998], which might influence the organization of the SAM domain.

The structure of the EphA4 domain reveals a novel mechanism through which modular
domains control protein-protein interactions. Since SAM domains are found in cell surface receptors,
cytoplasmic signaling proteins, and transcriptional activators and repressors, as well as chimeric
human oncoproteins, these results have general implications for understanding the formation of
complexes involved in normal and oncogenic signal transduction.

Having illustrated and described the principles of the invention in a preferred embodiment, it
should be appreciated by those skilled in the art that the invention can be modified in arrangement and
detail without departure from such principles. All modifications coming within the scope of the
following claims are claimed.

All publications, patents and patent applications referred to herein are incorporated by
reference in their entirety to the same extent as if each individual publication, patent or patent
application was specifically and individually indicated to be incorporated by reference in its entirety.

Detailed Description of the Drawings

Figure 1A shows a sequence alignment of SAM domains from selected proteins. Secondary
structure is indicated for the SAM domain from the EphA4 receptor tyrosine kinase. Residue
numbers for the start of each SAM domain are shown on the left and GenBank accession numbers on
the right. Conserved hydrophobic residues are colored green, acidic residues red, basic residues
blue, polar residues orange and glycines are colored pink. Residues at the dimer interface shown in
Figure 2C are indicated (•). Liprin α1 contains 3 SAM domains designated S1, S2 and S3.

Figure 1B shows a selection of multi-domain proteins containing a SAM domain (S).
Domains listed include tyrosine or serine/threonine kinase catalytic domains, myosin-like domain,
F-actin binding domain (F-actin BD), PDZ domain, SH2 domain, inositol phosphatase catalytic domain
(inositol p'tase), GTPase activating domain (GAP), DNA-binding domain (DNA-BD) and a
transmembrane region (TM).

Figure 2A, 2B, and 2C. Ribbons depiction of the SAM homo-dimer viewed (Figure 2A)
down the twofold symmetry axis and (Figure 2B) perpendicular to the symmetry axis. The dimer
subunits are coloured red and blue and α-helices are labeled. (Figure 2C) Ribbons stereo view
highlighting the dimer interface region. Aromatic, aliphatic, methionine, histidine and arginine
interacting side chains are coloured light blue, green, yellow, orange, and blue (see Figure 1A for
residue identification). All ribbon diagrams were generated using RIBBONS [Carson, 1991].

Figure 3A, B. Molecular surface and worm representations of the SAM homodimer. The
molecular surface of one subunit is shown with hydrophobic (Met, Val, Leu, Ile, Phe), basic (Arg,
Lys) and acidic (Glu, Asp) side chains coloured green, blue and red, respectively. The two
perspectives differ by a 90° rotation about the vertical axis. In Figure 3B the twofold rotation axis
relating the two subunits of the dimer is shown. The buried surface area of the dimer interface is
1923 Å². All molecular surfaces were generated using GRASP [Nicholls, 1991].
Figure 4. Gel filtration elution profile of wild type and single or double site mutants of the 
EphA4 receptor SAM domain. Chromatograms correspond to the loading of equivalent 
concentrations (10 mM) and total volumes (100 µl) of protein on a Superdex-75 gel filtration column 
(24 ml bed volume). The column was calibrated using Pharmacia low molecular weight standards. 


ATOM   463  N    ILE   959       1.282  13.156   8.019  1.00 14.35
ATOM   464  H    ILE   959       0.709  12.360   8.024  1.00 10.00
ATOM   465  CA   ILE   959       1.286  14.017   6.838  1.00 16.17
ATOM   466  CB   ILE   959       0.286  13.521   5.769  1.00 17.13
ATOM   467  CG2  ILE   959       0.351  14.410   4.532  1.00 17.99
ATOM   468  CG1  ILE   959       0.622  12.078   5.371  1.00 20.81
ATOM   469  CD1  ILE   959      -0.355  11.449   4.383  1.00 21.87
ATOM   470  C    ILE   959       1.042  15.493   7.163  1.00 16.99
ATOM   471  O    ILE   959       1.661  16.374   6.564  1.00 18.92
ATOM   472  N    THR   960       0.179  15.765   8.137  1.00 16.32
ATOM   473  H    THR   960      -0.303  15.030   8.563  1.00 10.00
ATOM   474  CA   THR   960      -0.103  17.142   8.534  1.00 16.86
ATOM   475  CB   THR   960      -1.165  17.190   9.660  1.00 18.03
ATOM   476  OG1  THR   960      -2.384  16.597   9.197  1.00 18.57
ATOM   477  HG1  THR   960      -3.070  16.619   9.863  1.00 10.00
ATOM   478  CG2  THR   960      -1.438  18.618  10.082  1.00 18.31
ATOM   479  C    THR   960       1.180  17.824   9.026  1.00 16.90
ATOM   480  O    THR   960       1.465  18.974   8.683  1.00 16.92
ATOM   481  N    HIS   961       1.955  17.105   9.830  1.00 16.15
ATOM   482  H    HIS   961       1.713  16.175  10.007  1.00 10.00
ATOM   483  CA   HIS   961       3.197  17.646  10.364  1.00 15.68
ATOM   484  CB   HIS   961       3.689  16.799  11.532  1.00 15.19
ATOM   485  CG   HIS   961       2.758  16.807  12.706  1.00 14.80
ATOM   486  CD2  HIS   961       1.715  17.624  12.994  1.00 14.08
ATOM   487  ND1  HIS   961       2.834  15.909  13.747  1.00 14.93
ATOM   488  HD1  HIS   961       3.301  15.070  13.827  1.00 10.00
ATOM   489  CE1  HIS   961       1.886  16.172  14.625  1.00 16.16
ATOM   490  NE2  HIS   961       1.191  17.207  14.192  1.00 14.72
ATOM   491  HE2  HIS   961       0.422  17.616  14.676  1.00 10.00
ATOM   492  C    HIS   961       4.248  17.788   9.268  1.00 15.94
ATOM   493  O    HIS   961       5.020  18.750   9.272  1.00 14.39
ATOM   494  N    GLN   962       4.242  16.853   8.316  1.00 16.90
ATOM   495  H    GLN   962       3.608  16.106   8.356  1.00 10.00
ATOM   496  CA   GLN   962       5.167  16.895   7.185  1.00 16.19
ATOM   497  CB   GLN   962       4.959  15.694   6.254  1.00 15.77
ATOM   498  CG   GLN   962       5.490  14.367   6.780  1.00 16.87
ATOM   499  CD   GLN   962       5.269  13.218   5.802  1.00 17.54
ATOM   500  OE1  GLN   962       4.677  13.396   4.737  1.00 18.11
ATOM   501  NE2  GLN   962       5.743  12.036   6.163  1.00 17.65
ATOM   502  HE21 GLN   962       6.214  11.964   7.022  1.00 10.00
ATOM   503  HE22 GLN   962       5.602  11.288   5.549  1.00 10.00
ATOM   504  C    GLN   962       4.903  18.173   6.401  1.00 15.74
ATOM   505  O    GLN   962       5.832  18.899   6.059  1.00 15.23
ATOM   506  N    ASN   963       3.625  18.463   6.168  1.00 15.23
ATOM   507  H    ASN   963       2.932  17.852   6.489  1.00 10.00
ATOM   508  CA   ASN   963       3.232  19.649   5.415  1.00 16.00
ATOM   509  CB   ASN   963       1.767  19.562   4.981  1.00 16.74
ATOM   510  CG   ASN   963       1.567  18.605   3.821  1.00 20.62
ATOM   511  OD1  ASN   963       2.379  18.560   2.895  1.00 24.90
ATOM   512  ND2  ASN   963       0.510  17.811   3.879  1.00 22.06
ATOM   513  HD21 ASN   963      -0.085  17.835   4.647  1.00  0.00
ATOM   514  HD22 ASN   963       0.384  17.209   3.108  1.00  0.00
ATOM   515  C    ASN   963       3.513  20.955   6.129  1.00 14.80
ATOM   516  O    ASN   963       3.907  21.925   5.495  1.00 13.94
ATOM   517  N    LYS   964       3.344  20.978   7.448  1.00 15.34
ATOM   518  H    LYS   964       3.023  20.172   7.901  1.00 10.00
ATOM   519  CA   LYS   964       3.618  22.191   8.208  1.00 16.38
ATOM   520  CB   LYS   964       3.254  22.013   9.686  1.00 16.89
ATOM   521  CG   LYS   964       3.415  23.289  10.501  1.00 20.93
ATOM   522  CD   LYS   964       2.888  23.125  11.911  1.00 28.63
ATOM   523  CE   LYS   964       1.917  24.244  12.280  1.00 31.73
ATOM   524  NZ   LYS   964       2.519  25.607  12.214  1.00 33.69
ATOM   525  HZ1  LYS   964       2.843  25.813  11.248  1.00 10.00
ATOM   526  HZ2  LYS   964       3.314  25.670  12.878  1.00 10.00
ATOM   527  HZ3  LYS   964       1.798  26.313  12.479  1.00 10.00
ATOM   528  C    LYS   964       5.103  22.514   8.078  1.00 16.37
ATOM   529  O    LYS   964       5.487  23.662   7.844  1.00 17.92
ATOM   530  N    ILE   965       5.930  21.481   8.199  1.00 16.65
ATOM   531  H    ILE   965       5.562  20.586   8.353  1.00 10.00
ATOM   532  CA   ILE   965       7.374  21.629   8.099  1.00 14.00
ATOM   533  CB   ILE   965       8.088  20.346   8.560  1.00 12.06
ATOM   534  CG2  ILE   965       9.560  20.369   8.151  1.00 11.41
ATOM   535  CG1  ILE   965       7.934  20.202  10.076  1.00 10.00
ATOM   536  CD1  ILE   965       8.302  18.849  10.602  1.00 10.00
ATOM   537  C    ILE   965       7.800  22.012   6.687  1.00 13.95
ATOM   538  O    ILE   965       8.574  22.951   6.506  1.00 15.38
ATOM   539  N    LEU   966       7.274  21.310   5.691  1.00 13.46
ATOM   540  H    LEU   966       6.650  20.584   5.886  1.00 10.00
ATOM   541  CA   LEU   966       7.609  21.605   4.304  1.00 15.13
ATOM   542  CB   LEU   966       7.023  20.542   3.372  1.00 15.28
ATOM   543  CG   LEU   966       7.738  19.187   3.463  1.00 14.45
ATOM   544  CD1  LEU   966       6.910  18.105   2.791  1.00 14.55
ATOM   545  CD2  LEU   966       9.123  19.285   2.841  1.00 12.40
ATOM   546  C    LEU   966       7.173  23.015   3.886  1.00 15.31
ATOM   547  O    LEU   966       7.909  23.712   3.181  1.00 13.98
ATOM   548  N    SER   967       6.000  23.446   4.345  1.00 15.58
ATOM   549  H    SER   967       5.435  22.863   4.889  1.00 10.00
ATOM   550  CA   SER   967       5.506  24.780   4.029  1.00 15.48
ATOM   551  CB   SER   967       4.073  24.974   4.537  1.00 16.99
ATOM   552  OG   SER   967       3.142  24.292   3.715  1.00 19.67
ATOM   553  HG   SER   967       2.272  24.346   4.118  1.00 10.00
ATOM   554  C    SER   967       6.427  25.824   4.643  1.00 16.18
ATOM   555  O    SER   967       6.725  26.839   4.017  1.00 16.88
ATOM   556  N    SER   968       6.892  25.554   5.859  1.00 16.46
ATOM   557  H    SER   968       6.624  24.721   6.303  1.00 10.00
ATOM   558  CA   SER   968       7.789  26.459   6.570  1.00 17.49
ATOM   559  CB   SER   968       8.024  25.947   7.995  1.00 17.79
ATOM   560  OG   SER   968       8.766  26.871   8.774  1.00 16.89
ATOM   561  HG   SER   968       8.248  27.676   8.828  1.00 10.00
ATOM   562  C    SER   968       9.114  26.562   5.815  1.00 18.08
ATOM   563  O    SER   968       9.661  27.654   5.650  1.00 18.78
ATOM   564  N    VAL   969       9.621  25.418   5.360  1.00 17.77
ATOM   565  H    VAL   969       9.145  24.581   5.543  1.00 10.00
ATOM   566  CA   VAL   969      10.863  25.362   4.598  1.00 16.32
ATOM   567  CB   VAL   969      11.201  23.892   4.189  1.00 15.01
ATOM   568  CG1  VAL   969      12.232  23.855   3.065  1.00 14.69
ATOM   569  CG2  VAL   969      11.729  23.123   5.390  1.00 12.25
ATOM   570  C    VAL   969      10.720  26.256   3.360  1.00 16.88
ATOM   571  O    VAL   969      11.597  27.072   3.071  1.00 18.31
ATOM   572  N    GLN   970       9.590  26.140   2.666  1.00 17.34
ATOM   573  H    GLN   970       8.915  25.497   2.971  1.00 10.00
ATOM   574  CA   GLN   970       9.334  26.945   1.475  1.00 18.13
ATOM   575  CB   GLN   970       7.977  26.588   0.859  1.00 17.14
ATOM   576  CG   GLN   970       7.886  25.155   0.350  1.00 18.59
ATOM   577  CD   GLN   970       6.520  24.809  -0.215  1.00 20.55
ATOM   578  OE1  GLN   970       6.417  24.141  -1.237  1.00 23.07
ATOM   579  NE2  GLN   970       5.466  25.251   0.453  1.00 21.52
ATOM   580  HE21 GLN   970       5.590  25.769   1.273  1.00 10.00
ATOM   581  HE22 GLN   970       4.591  25.025   0.074  1.00 10.00
ATOM   582  C    GLN   970       9.391  28.442   1.792  1.00 19.00
ATOM   583  O    GLN   970       9.998  29.216   1.047  1.00 19.59
ATOM   584  N    ALA   971       8.793  28.832   2.915  1.00 18.42
ATOM   585  H    ALA   971       8.345  28.158   3.469  1.00 10.00
ATOM   586  CA   ALA   971       8.772  30.227   3.351  1.00 18.65
ATOM   587  CB   ALA   971       7.797  30.397   4.510  1.00 16.83
ATOM   588  C    ALA   971      10.160  30.733   3.748  1.00 19.41
ATOM   589  O    ALA   971      10.512  31.884   3.477  1.00 20.57
ATOM   590  N    MET   972      10.945  29.880   4.399  1.00 20.08
ATOM   591  H    MET   972      10.613  28.976   4.582  1.00 10.00
ATOM   592  CA   MET   972      12.283  30.263   4.819  1.00 20.97
ATOM   593  CB   MET   972      12.876  29.248   5.788  1.00 23.16
ATOM   594  CG   MET   972      12.260  29.289   7.174  1.00 25.68
ATOM   595  SD   MET   972      13.275  28.393   8.360  1.00 29.72
ATOM   596  CE   MET   972      13.134  26.696   7.733  1.00 28.75
ATOM   597  C    MET   972      13.199  30.431   3.620  1.00 23.17
ATOM   598  O    MET   972      14.072  31.297   3.624  1.00 22.87
ATOM   599  N    ARG   973      13.002  29.603   2.597  1.00 22.37
ATOM   600  H    ARG   973      12.308  28.911   2.661  1.00 10.00
ATOM   601  CA   ARG   973      13.803  29.690   1.382  1.00 22.73
ATOM   602  CB   ARG   973      13.513  28.507   0.458  1.00 19.60
ATOM   603  CG   ARG   973      14.116  27.209   0.929  1.00 18.92
ATOM   604  CD   ARG   973      13.681  26.058   0.056  1.00 19.57
ATOM   605  NE   ARG   973      13.960  26.318  -1.353  1.00 23.42
ATOM   606  HE   ARG   973      13.341  26.895  -1.845  1.00 10.00
ATOM   607  CZ   ARG   973      14.994  25.817  -2.020  1.00 22.59
ATOM   608  NH1  ARG   973      15.862  25.015  -1.414  1.00 23.50
ATOM   609  HH11 ARG   973      15.747  24.775  -0.452  1.00  0.00
ATOM   610  HH12 ARG   973      16.628  24.635  -1.935  1.00  0.00
ATOM   611  NH2  ARG   973      15.167  26.135  -3.295  1.00 22.09
ATOM   612  HH21 ARG   973      14.515  26.744  -3.751  1.00  0.00
ATOM   613  HH22 ARG   973      15.933  25.750  -3.811  1.00  0.00
ATOM   614  C    ARG   973      13.507  31.010   0.670  1.00 24.33
ATOM   615  O    ARG   973      14.432  31.730   0.282  1.00 22.76
ATOM   616  N    THR   974      12.220  31.332   0.527  1.00 26.94
ATOM   617  H    THR   974      11.515  30.729   0.852  1.00 10.00
ATOM   618  CA   THR   974      11.795  32.576  -0.113  1.00 31.31
ATOM   619  CB   THR   974      10.264  32.669  -0.193  1.00 30.78
ATOM   620  OG1  THR   974       9.750  31.491  -0.822  1.00 31.97
ATOM   621  HG1  THR   974      10.092  31.425  -1.714  1.00 10.00
ATOM   622  CG2  THR   974       9.842  33.884  -1.005  1.00 32.02
ATOM   623  C    THR   974      12.325  33.783   0.671  1.00 35.46
ATOM   624  O    THR   974      12.720  34.792   0.081  1.00 36.00
ATOM   625  N    GLN   975      12.364  33.655   1.996  1.00 39.13
ATOM   626  H    GLN   975      12.031  32.830   2.414  1.00 10.00
ATOM   627  CA   GLN   975      12.858  34.715   2.869  1.00 42.57
ATOM   628  CB   GLN   975      12.586  34.359   4.334  1.00 45.95
ATOM   629  CG   GLN   975      12.793  35.519   5.300  1.00 51.94
ATOM   630  CD   GLN   975      12.891  35.080   6.751  1.00 54.33
ATOM   631  OE1  GLN   975      12.227  34.130   7.180  1.00 54.56
ATOM   632  NE2  GLN   975      13.728  35.773   7.517  1.00 55.74
ATOM   633  HE21 GLN   975      14.227  36.509   7.105  1.00 10.00
ATOM   634  HE22 GLN   975      13.811  35.543   8.462  1.00 10.00
ATOM   635  C    GLN   975      14.361  34.917   2.660  1.00 42.58
ATOM   636  O    GLN   975      14.862  36.043   2.706  1.00 43.50
ATOM   637  N    MET   976      15.072  33.818   2.424  1.00 42.59
ATOM   638  H    MET   976      14.606  32.954   2.412  1.00 10.00
ATOM   639  CA   MET   976      16.513  33.860   2.204  1.00 42.79
ATOM   640  CB   MET   976      17.129  32.480   2.416  1.00 41.79
ATOM   641  CG   MET   976      16.916  31.925   3.810  1.00 42.17
ATOM   642  SD   MET   976      17.517  33.018   5.102  1.00 44.85
ATOM   643  CE   MET   976      16.022  33.352   6.019  1.00 42.12
ATOM   644  C    MET   976      16.892  34.408   0.827  1.00 43.71
ATOM   645  O    MET   976      18.016  34.871   0.631  1.00 44.60
ATOM   646  N    GLN   977      15.975  34.327  -0.136  1.00 43.94
ATOM   647  H    GLN   977      15.105  33.913   0.057  1.00 10.00
ATOM   648  CA   GLN   977      16.235  34.859  -1.472  1.00 45.08
ATOM   649  CB   GLN   977      15.122  34.482  -2.449  1.00 46.20
ATOM   650  CG   GLN   977      15.070  33.029  -2.832  1.00 49.33
ATOM   651  CD   GLN   977      14.291  32.813  -4.112  1.00 52.53
ATOM   652  OE1  GLN   977      14.742  33.196  -5.195  1.00 52.71
ATOM   653  NE2  GLN   977      13.119  32.198  -4.000  1.00 53.68
ATOM   654  HE21 GLN   977      12.811  31.913  -3.116  1.00 10.00
ATOM   655  HE22 GLN   977      12.622  32.067  -4.836  1.00 10.00
ATOM   656  C    GLN   977      16.282  36.375  -1.370  1.00 45.82
ATOM   657  O    GLN   977      17.046  37.037  -2.070  1.00 45.80
ATOM   658  N    GLN   978      15.453  36.903  -0.475  1.00 47.73
ATOM   659  H    GLN   978      14.872  36.302   0.037  1.00 10.00
ATOM   660  CA   GLN   978      15.337  38.333  -0.224  1.00 48.95
ATOM   661  CB   GLN   978      14.007  38.606   0.482  1.00 50.40
ATOM   662  CG   GLN   978      12.800  38.062  -0.268  1.00 51.80
ATOM   663  CD   GLN   978      11.535  37.947   0.556  1.00 54.30
ATOM   664  OE1  GLN   978      10.439  37.770   0.021  1.00 53.56
ATOM   665  NE2  GLN   978      11.683  38.018   1.877  1.00 54.22
ATOM   666  HE21 GLN   978      12.559  38.123   2.294  1.00 10.00
ATOM   667  HE22 GLN   978      10.843  37.960   2.381  1.00 10.00
ATOM   668  C    GLN   978      16.511  38.860   0.604  1.00 49.37
ATOM   669  O    GLN   978      16.656  40.068   0.792  1.00 51.47
ATOM   670  N    MET   979      17.361  37.949   1.070  1.00 49.20
ATOM   671  H    MET   979      17.205  37.003   0.891  1.00 10.00
ATOM   672  CA   MET   979      18.532  38.309   1.863  1.00 49.49
ATOM   673  CB   MET   979      18.939  37.138   2.767  1.00 50.98
ATOM   674  CG   MET   979      19.641  37.532   4.064  1.00 52.98
ATOM   675  SD   MET   979      18.533  38.331   5.249  1.00 56.67
ATOM   676  CE   MET   979      17.092  37.225   5.208  1.00 55.34
ATOM   677  C    MET   979      19.702  38.672   0.941  1.00 48.66
ATOM   678  O    MET   979      20.842  38.788   1.392  1.00 46.62
ATOM   679  N    HIS   980      19.424  38.802  -0.356  1.00 46.50
ATOM   680  H    HIS   980      18.519  38.655  -0.702  1.00 10.00
ATOM   681  CA   HIS   980      20.454  39.155  -1.319  1.00 44.17
ATOM   682  CB   HIS   980      21.468  38.018  -1.487  1.00 39.73
ATOM   683  CG   HIS   980      20.883  36.743  -2.002  1.00 37.33
ATOM   684  CD2  HIS   980      20.507  35.611  -1.360  1.00 34.61
ATOM   685  ND1  HIS   980      20.667  36.504  -3.343  1.00 34.47
ATOM   686  HD1  HIS   980      20.768  37.181  -4.053  1.00 10.00
ATOM   687  CE1  HIS   980      20.192  35.286  -3.505  1.00 30.99
ATOM   688  NE2  HIS   980      20.087  34.722  -2.315  1.00 32.54
ATOM   689  HE2  HIS   980      19.880  33.801  -2.092  1.00 10.00
ATOM   690  C    HIS   980      19.910  39.599  -2.676  1.00 45.06
ATOM   691  O    HIS   980      20.126  38.942  -3.695  1.00 46.18
ATOM   692  N    GLY   981      19.171  40.703  -2.668  1.00 46.12
ATOM   693  H    GLY   981      18.996  41.168  -1.838  1.00 10.00
ATOM   694  CA   GLY   981      18.623  41.249  -3.900  1.00 46.12
ATOM   695  C    GLY   981      19.611  42.222  -4.526  1.00 46.17
ATOM   696  O    GLY   981      19.297  42.809  -5.583  1.00 45.93
ATOM   697  OT   GLY   981      20.710  42.404  -3.954  1.00 47.24
ATOM   698  OH2  TIP3    1       5.348  20.105  18.757  1.00 13.76 SOLV
ATOM   699  OH2  TIP3    2      -1.607  18.597   5.643  1.00 10.67 SOLV
ATOM   700  OH2  TIP3    3      11.575   6.309   4.064  1.00 22.41 SOLV
ATOM   701  OH2  TIP3    4      -1.390  17.519  15.624  1.00 26.01 SOLV
ATOM   702  OH2  TIP3    5      10.169  12.836  -1.287  1.00 21.06 SOLV
ATOM   703  OH2  TIP3    6      22.240  14.887   7.379  1.00 23.85 SOLV
ATOM   704  OH2  TIP3    7       4.944  28.525   2.399  1.00 23.14 SOLV
ATOM   705  OH2  TIP3    8       9.075  16.556  -1.049  1.00 20.96 SOLV
ATOM   706  OH2  TIP3    9      20.954  26.572  12.360  1.00 24.79 SOLV
ATOM   707  OH2  TIP3   10      11.237  23.923  22.404  1.00 26.34 SOLV
ATOM   708  OH2  TIP3   12       2.003  11.736   0.983  1.00 32.92 SOLV
ATOM   709  OH2  TIP3   13      22.341  26.794   8.789  1.00 30.68 SOLV
ATOM   710  OH2  TIP3   14      20.229   8.193   9.431  1.00 32.28 SOLV
ATOM   711  OH2  TIP3   15       9.002   9.880  15.816  1.00 43.59 SOLV
ATOM   712  OH2  TIP3   16      -1.539  23.366  11.757  1.00 34.88 SOLV
ATOM   713  OH2  TIP3   18      17.039  21.091  18.454  1.00 24.01 SOLV
ATOM   714  OH2  TIP3   19       1.972  14.588   0.930  1.00 37.49 SOLV
ATOM   715  OH2  TIP3   20       2.846  13.524  16.999  1.00 38.34 SOLV
ATOM   716  OH2  TIP3   21      23.328  11.454  15.152  1.00 32.30 SOLV
ATOM   717  OH2  TIP3   22      20.264  11.491   3.039  1.00 37.02 SOLV
ATOM   718  OH2  TIP3   23      -1.623  13.755   9.215  1.00 24.03 SOLV
ATOM   719  OH2  TIP3   24       7.117   7.426   3.341  1.00 39.59 SOLV
ATOM   720  OH2  TIP3   25      12.549  12.170  21.440  1.00 39.58 SOLV
ATOM   721  OH2  TIP3   26      18.509  15.812   3.112  1.00 19.84 SOLV
ATOM   722  OH2  TIP3   27      -1.105  20.234  15.360  1.00 54.40 SOLV
ATOM   723  OH2  TIP3   28      13.308  28.872  -3.398  1.00 31.59 SOLV
ATOM   724  OH2  TIP3   29      20.963  24.454  14.891  1.00 29.47 SOLV
ATOM   725  OH2  TIP3   30      11.976  15.552  20.454  1.00 22.32 SOLV
ATOM   726  OH2  TIP3   31      15.358   9.999   2.685  1.00 25.88 SOLV
ATOM   727  OH2  TIP3   32       7.138  31.193  -0.433  1.00 34.94 SOLV
ATOM   728  OH2  TIP3   33      18.565  19.866  15.827  1.00 31.14 SOLV
ATOM   729  OH2  TIP3   35      10.191  11.998  19.068  1.00 36.35 SOLV
ATOM   730  OH2  TIP3   36      -1.668  14.793  13.334  1.00 32.50 SOLV
ATOM   731  OH2  TIP3   37      24.133  32.437  15.070  1.00 58.18 SOLV
ATOM   732  OH2  TIP3   38       8.142  29.716   8.516  1.00 38.02 SOLV
ATOM   733  OH2  TIP3   39      16.767  15.194  -0.104  1.00 22.13 SOLV
ATOM   734  OH2  TIP3   40      -0.649  26.330  15.687  1.00 40.04 SOLV
ATOM   735  OH2  TIP3   41      20.560  21.578  16.917  1.00 41.45 SOLV
ATOM   736  OH2  TIP3   42      24.262  18.640   8.488  1.00 46.62 SOLV
ATOM   737  OH2  TIP3   43       8.129   9.379   1.097  1.00 38.14 SOLV
ATOM   738  OH2  TIP3   44       4.176  17.058  22.386  1.00 43.87 SOLV
ATOM   739  OH2  TIP3   45       4.329  26.311   8.281  1.00 32.81 SOLV
ATOM   740  OH2  TIP3   46      19.760  18.168   3.397  1.00 26.43 SOLV
ATOM   741  OH2  TIP3   47       4.718  28.777   6.740  1.00 27.69 SOLV
ATOM   742  OH2  TIP3   48       7.659   9.629  -3.363  1.00 41.77 SOLV
ATOM   743  OH2  TIP3   49       2.827  14.980  21.015  1.00 38.06 SOLV
ATOM   744  OH2  TIP3   50       5.873  25.576  -4.108  1.00 58.77 SOLV
ATOM   745  OH2  TIP3   51      22.281  25.625  17.414  1.00 46.99 SOLV
ATOM   746  OH2  TIP3   53       8.311  41.084   2.721  1.00 41.45 SOLV
ATOM   747  OH2  TIP3   54      23.900  11.535  10.030  1.00 28.69 SOLV
ATOM   748  OH2  TIP3   55      23.435  27.423  11.489  1.00 40.75 SOLV
ATOM   749  OH2  TIP3   56      16.616  38.557  -4.772  1.00 41.46 SOLV
ATOM   750  OH2  TIP3   57      10.916   5.891  -0.754  1.00 26.82 SOLV
END
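By way of illustration only, coordinate rows of the kind listed in the table above (record type, atom serial number, atom name, residue name, residue number, x, y, z, occupancy, temperature factor) may be read programmatically and used to measure interatomic distances. The following is a minimal Python sketch; the function names `parse_atom_row` and `distance` are hypothetical, the rows are assumed to be whitespace-delimited, and the two example rows are the CA and C atoms of Ile 959 from the listing.

```python
import math


def parse_atom_row(row):
    """Parse one whitespace-delimited ATOM row into a dictionary.

    Assumed column order: record, serial, atom name, residue name,
    residue number, x, y, z, occupancy, temperature factor.
    A trailing segment label (e.g. SOLV) is ignored.
    """
    fields = row.split()
    if len(fields) < 10 or fields[0] != "ATOM":
        raise ValueError(f"not an ATOM row: {row!r}")
    return {
        "serial": int(fields[1]),
        "atom": fields[2],
        "residue": fields[3],
        "resnum": int(fields[4]),
        "xyz": tuple(float(v) for v in fields[5:8]),
        "occupancy": float(fields[8]),
        "bfactor": float(fields[9]),
    }


def distance(a, b):
    """Euclidean distance in angstroms between two parsed atoms."""
    return math.dist(a["xyz"], b["xyz"])


rows = [
    "ATOM   465  CA   ILE   959       1.286  14.017   6.838  1.00 16.17",
    "ATOM   470  C    ILE   959       1.042  15.493   7.163  1.00 16.99",
]
ca, c = (parse_atom_row(r) for r in rows)
print(round(distance(ca, c), 2))  # CA-C distance in angstroms, prints 1.53
```

A distance of about 1.5 angstroms between these two atoms is what would be expected for a CA-C covalent bond, which gives a quick consistency check on rows transcribed from the listing.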

FULL CITATIONS FOR REFERENCES REFERRED TO IN THE SPECIFICATION 

1. Schultz, J., Ponting, C.P., Hofmann, K. & Bork, P. SAM as a protein interaction domain 
involved in developmental regulation. Protein Sci 6, 249-53 (1997). 

2. Jousset, C., et al. A domain of TEL conserved in a subset of ETS proteins defines a specific 
oligomerization interface essential to the mitogenic properties of the TEL-PDGFR beta oncoprotein. 
Embo J 16, 69-82 (1997). 

3. Peterson, A.J., et al. A domain shared by the Polycomb group proteins Scm and ph mediates 
heterotypic and homotypic interactions. Mol Cell Biol 17, 6683-92 (1997). 

4. Tu, H., Barr, M., Dong, D.L. & Wigler, M. Multiple regulatory domains on the Byr2 protein 
kinase. Mol Cell Biol 17, 5876-87 (1997). 

5. Kyba, M. & Brock, H.W. The SAM domain of polyhomeotic, RAE28, and scm mediates 
specific interactions through conserved residues. Dev Genet 22, 74-84 (1998). 

6. Golub, T.R., Barker, G.F., Lovett, M. & Gilliland, D.G. Fusion of PDGF receptor beta to a 
novel ets-like gene, tel, in chronic myelomonocytic leukemia with t(5;12) chromosomal translocation. 
Cell 77, 307-16 (1994). 

7. Golub, T.R., et al. Oligomerization of the ABL tyrosine kinase by the Ets protein TEL in 
human leukemia. Mol Cell Biol 16, 4107-16 (1996). 

8. Lacronique, V., et al. A TEL-JAK2 fusion protein with constitutive kinase activity in human 
leukemia. Science 278, 1309-12 (1997). 

9. Golub, T.R., et al. Fusion of the TEL gene on 12p13 to the AML1 gene on 21q22 in acute 
lymphoblastic leukemia. Proc Natl Acad Sci U S A 92, 4917-21 (1995). 

10. Henkemeyer, M., et al. Nuk controls pathfinding of commissural axons in the mammalian 
central nervous system. Cell 86, 35-46 (1996). 

11. Orioli, D., Henkemeyer, M., Lemke, G., Klein, R. & Pawson, T. Sek4 and Nuk receptors 
cooperate in guidance of commissural axons and in palate formation. Embo J 15, 6035-49 (1996). 

12. Krull, C.E., et al. Interactions of Eph-related receptors and ligands confer rostrocaudal 
pattern to trunk neural crest migration. Curr Biol 7, 571-80 (1997). 

13. Xu, Q., Alldus, G., Macdonald, R., Wilkinson, D.G. & Holder, N. Function of the Eph-related 
kinase rtk1 in patterning of the zebrafish forebrain. Nature 381, 319-22 (1996). 

14. Wang, H.U., Chen, Z.F. & Anderson, D.J. Molecular distinction and angiogenic interaction 
between embryonic arteries and veins revealed by ephrin-B2 and its receptor Eph-B4. Cell 93, 741-53 
(1998). 

15. Stein, E., et al. Eph receptors discriminate specific ligand oligomers to determine alternative 
signaling complexes, attachment and assembly responses. Genes Dev 12, 667-78 (1998). 

16. Lackmann, M., et al. Ligand for EPH-related kinase (LERK) 7 is the preferred high affinity 
ligand for the HEK receptor. J Biol Chem 272, 16521-30 (1997). 

17. Davis, S., et al. Ligands for EPH-related receptor tyrosine kinases that require membrane 
attachment or clustering for activity. Science 266, 816-9 (1994). 




18. Lackmann, M., et al. Distinct subdomains of the EphA3 receptor mediate ligand binding and 
receptor dimerization. J Biol Chem 273, 20228-37 (1998). 

19. Hock, B., et al. PDZ-domain-mediated interaction of the eph-related receptor tyrosine kinase 
EphB3 and the ras-binding protein AF6 depends on the kinase activity of the receptor. Proc Natl Acad 
Sci U S A 95, 9779-84 (1998). 

20. Ponting, C.P. SAM: a novel motif in yeast sterile and Drosophila polyhomeotic proteins. 
Protein Sci 4, 1928-30 (1995). 

21. Klambt, C. The Drosophila gene pointed encodes two ETS-like proteins which are involved 
in the development of the midline glial cells. Development 117, 163-76 (1993). 

22. Serra-Pages, C., Medley, Q.G., Tang, M., Hart, A. & Streuli, M. Liprins, a family of LAR 
transmembrane protein-tyrosine phosphatase-interacting proteins. J Biol Chem 273, 15611-20 (1998). 

23. Kuriyan, J. & Cowburn, D. Modular Peptide Binding Domains. Annu. Rev. Biophys. Biomol. 
Struct. 26, 259-288 (1997). 

24. Stein, E., Cerretti, D.P. & Daniel, T.O. Ligand activation of ELK receptor tyrosine kinase 
promotes its association with Grb10 and Grb2 in vascular endothelial cells. J Biol Chem 271, 23588-93 
(1996). 

25. Terwilliger, T.C., Kim, S.H. & Eisenberg, D. Generalized method of determining heavy-atom 
positions using the difference Patterson function. Acta Crystallogr. Sect. A 43, 1-5 (1987). 

26. Furey, W. & Swaminathan, S. PHASES in American Crystallography Association Meeting 
Abstracts 73 (1990). 

27. Jones, T.A., Zou, J.Y., Cowan, S.W. & Kjeldgaard, M. Improved methods for building 
protein models in electron density maps and the location of errors in these models. Acta Crystallogr. 
A47, 110-119 (1991). 

28. Brunger, A.T. (Yale University, New Haven, CT, 1992). 

29. Carson, M. Ribbons 2.0. Journal of Applied Crystallography 24, 958 (1991). 

30. Nicholls, A., Sharp, K.A. & Honig, B. Protein folding and association: insights from the 
interfacial and thermodynamic properties of hydrocarbons. Proteins: Struct. Funct. and Genetics 11, 
281-296 (1991). 




Claims 




WE CLAIM: 

1. A purified three dimensional structure of a polypeptide corresponding to one or more SAM 
domains. 

2. A three dimensional structure as claimed in claim 1, wherein the SAM domain is a SAM domain 
of an Eph receptor. 

3. A three dimensional structure as claimed in claim 2 wherein the Eph receptor is EphA. 

4. A three dimensional structure as claimed in claim 1 complexed with one or more compounds. 

5. A three dimensional structure as claimed in claim 1 comprising one or more heavy metal atoms. 

6. A purified crystalline form of a polypeptide corresponding to one or more SAM domains. 

7. A crystalline form as claimed in claim 6 having dimensions of about a = b = 77.14 ± .03 
angstroms, c = 24.3 ± .04 angstroms. 

8. A crystalline form as claimed in claim 7 having the co-ordinates set out in Table 2. 

9. A method of forming a crystalline form as claimed in claim 6 comprising 

(a) mixing a volume of a SAM domain with a reservoir solution; and 

(b) incubating the mixture obtained in step (a) over the reservoir solution in a closed container 
under conditions suitable for crystallization. 

10. A method of determining three dimensional structures of polypeptides with SAM domains of 
unknown structure comprising the step of applying the structural atomic coordinates of a three 
dimensional structure as claimed in claim 1 or a crystalline form as claimed in claim 7 or 8. 

11. A method for identifying a potential modulator of a SAM domain of an Eph receptor function 
comprising docking a computer representation of a structure of a compound with a computer 
representation of a structure of one or more SAM domains of an Eph receptor that is defined by 
the atomic structural coordinates of the three dimensional structure as claimed in claim 2 or a 
crystalline form as claimed in claim 7 or 8. 

12. A method as claimed in claim 11 comprising the following steps: 

(a) docking a computer representation of a compound from a computer data base with a 
computer representation of a selected site on a three dimensional structure of a SAM domain 
of an Eph receptor as claimed in claim 2 or a crystalline form as claimed in claim 7 or 8 to 
obtain a complex; 

(b) determining a conformation of the complex with a favourable geometric fit and favourable 
complementary interactions; and 

(c) identifying compounds that best fit the selected site as potential modulators of SAM domain 
function. 

13. A method as claimed in claim 11, comprising the following steps: 

(a) modifying a computer representation of a compound complexed with a selected site on 
a three dimensional structure of a SAM domain of an Eph receptor as claimed in claim 
2 or a crystalline form as claimed in claim 7 or 8, by deleting or adding a chemical 
group or groups; 

(b) determining a conformation of the complex with a favourable geometric fit and 
favourable complementary interactions; and 

(c) identifying a compound that best fits the selected site as a potential modulator of a SAM 
domain. 

14. A method as claimed in claim 11 comprising the following steps: 

(a) selecting a computer representation of a compound complexed with a selected site on a three 
dimensional structure of a SAM domain of an Eph receptor as claimed in claim 2 or a 
crystalline form as claimed in claim 7 or 8; and 

(b) searching for molecules in a data base that are similar to the compound using a searching 
computer program, or replacing portions of the compound with similar chemical structures 
from a data base using a compound building computer program. 

15. A potential modulator of a function of a SAM domain of an Eph receptor identified by a method 
as claimed in any one of claims 11 to 14. 

16. A method of treating a disease associated with a SAM domain of an Eph receptor with 
inappropriate activity in a cellular organism, comprising: 

(a) administering a crystalline form of a polypeptide as claimed in claim 6 or a modulator 
identified using a method as claimed in any one of claims 11 to 14, in an acceptable 
pharmaceutical preparation; and 

(b) activating or inhibiting a SAM domain function to treat the disease. 

17. A method as claimed in claim 16 wherein the disease is a cell proliferative disease or disease 
associated with the nervous system. 

18. A peptide of the formula I which mediates SAM domain function:

X-X1-X2-X3-X4-X5-X6     I

wherein X and X6 represent 0 to 70, preferably 0 to 50 amino acids, more preferably 2 to 20
amino acids, and X1 represents Leu, Phe, Asp, Ala, Glu, or Gly, preferably Leu or Gly, X2
represents Glu, Asp, Ser, Ile, Ala, Arg, Lys, or Gln, preferably Glu or Asp, X3 represents Ala,
Val, Glu, Phe, Ser, Ile, Met, Leu, His, Gln, Arg, or Asp, preferably Ala, Val, or Phe, X4 is Val,
Met, [illegible], Ile, preferably Val or Leu, or Phe, X5 is Val, Ser, Leu, Asp, Ala, Pro, Asn,
Lys, or Cys, preferably Val or Ser.
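
The residue constraints of formula I can be checked mechanically. The sketch below encodes the five core positions X1 to X5 in one-letter code and reports where a peptide contains a matching five-residue core with flanks of at most 70 residues. One assumption is made loudly: the X4 list is partly illegible in the source, so Leu and Phe are included only because the claim names them as preferred; the function name is hypothetical.

```python
# Allowed residues for the core positions X1..X5 of formula I (one-letter code).
ALLOWED = [
    "LFDAEG",        # X1: Leu, Phe, Asp, Ala, Glu, Gly
    "EDSIARKQ",      # X2: Glu, Asp, Ser, Ile, Ala, Arg, Lys, Gln
    "AVEFSIMLHQRD",  # X3: Ala, Val, Glu, Phe, Ser, Ile, Met, Leu, His, Gln, Arg, Asp
    "VMILF",         # X4: Val, Met, Ile, plus Leu and Phe (assumed from the preferred residues)
    "VSLDAPNKC",     # X5: Val, Ser, Leu, Asp, Ala, Pro, Asn, Lys, Cys
]
MAX_FLANK = 70  # the flanking segments may each hold 0 to 70 amino acids


def core_positions(peptide):
    """Return start indices of five-residue windows that satisfy X1..X5."""
    hits = []
    for start in range(len(peptide) - 4):
        window = peptide[start:start + 5]
        core_ok = all(res in allowed for res, allowed in zip(window, ALLOWED))
        n_flank = start                       # residues before the core
        c_flank = len(peptide) - (start + 5)  # residues after the core
        if core_ok and n_flank <= MAX_FLANK and c_flank <= MAX_FLANK:
            hits.append(start)
    return hits


# Peptides taken from the sequence listing (SEQ ID NOS. 56, 64 and 93).
leavv = core_positions("LEAVV")
longer = core_positions("TTLEAVVHMSQD")
garfl = core_positions("GARFL")
```

For the fragment TTLEAVVHMSQD the only matching core is LEAVV, starting at index 2, with TT and HMSQD as the flanking segments.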

19. A peptide as claimed in claim 18 wherein X represents TT, ID, TS, DD, GYTT (SEQ ID. NO.
38), AAGYTT (SEQ ID. NO. 39), FTAAGYTT (SEQ ID. NO. 40), DNFTAAGYTT (SEQ ID.
NO. 41), or YKDNFTAAGYTT (SEQ ID. NO. 42).

20. A peptide as claimed in claim 18 wherein X6 represents HM, HMSQ (SEQ ID. NO. 43),
HMSQD (SEQ ID. NO. 44), HMSQDD (SEQ ID. NO. 45), HMSQDDLA (SEQ ID. NO. 46),
QMMM (SEQ ID. NO. 47), QMMMED (SEQ ID. NO. 48), QMMMEDLL (SEQ ID. NO. 49),
DITE (SEQ ID. NO. 50), DITEED (SEQ ID. NO. 51), DITEEDL (SEQ ID. NO. 52), NLTE
(SEQ ID. NO. 53), NLTEND (SEQ ID. NO. 54), or NLTENDI (SEQ ID. NO. 55).

21. A peptide of the formula I as claimed in claim 18 which is LEAVV (SEQ ID. NO. 56),
TTLEAVV (SEQ ID. NO. 57), LEAVVHM (SEQ ID. NO. 58), LEAVVHMSQ (SEQ ID. NO.
59), LEAVVHMSQD (SEQ ID. NO. 60), LEAVVHMSQDDL (SEQ ID. NO. 61),
LEAVVHMSQDDLAR (SEQ ID. NO. 62), TTLEAVVHMS (SEQ ID. NO. 63),
TTLEAVVHMSQD (SEQ ID. NO. 64), TTLEAVVHMSQDDL (SEQ ID. NO. 65),
TTLEAVVHMSQDDLAR (SEQ ID. NO. 66), GYTTLEAVV (SEQ ID. NO. 67),
GYTTLEAVVHMS (SEQ ID. NO. 68), GYTTLEAVVHMSQD (SEQ ID. NO. 69),
GYTTLEAVVHMSQDDL (SEQ ID. NO. 70), GYTTLEAVVHMSQDDLAR (SEQ ID. NO.
71), FDVVS (SEQ ID. NO. 72), FDVVSQ (SEQ ID. NO. 73), FDVVSQMM (SEQ ID. NO. 74),
FDVVSQMMME (SEQ ID. NO. 75), FDVVSQMMMEDIL (SEQ ID. NO. 76), TSFDVVS
(SEQ ID. NO. 77), TSFDVVSQ (SEQ ID. NO. 78), TSFDVVSQMM (SEQ ID. NO. 79),
TSFDVVSQMMME (SEQ ID. NO. 80), TSFDVVSQMMMEDIL (SEQ ID. NO. 81), LEFLS
(SEQ ID. NO. 82), LEFLSD (SEQ ID. NO. 83), LEFLSDIT (SEQ ID. NO. 84), LEFLSDITEE
(SEQ ID. NO. 85), LEFLSDITEEDL (SEQ ID. NO. 86), DDLEFLS (SEQ ID. NO. 87),
GWDDLEFLS (SEQ ID. NO. 88), DDLEFLSD (SEQ ID. NO. 89), DDLEFLSDIT (SEQ ID.
NO. 90), DDLEFLSDITEE (SEQ ID. NO. 91), DDLEFLSDITEEDL (SEQ ID. NO. 92),
GARFL (SEQ ID. NO. 93), GARFLN (SEQ ID. NO. 94), GARFLNLT (SEQ ID. NO. 95),
GARFLNLTEN (SEQ ID. NO. 96), and IDGARFL (SEQ ID. NO. 97).

22. A peptide of the formula II which mediates SAM domain function:

X7-X8-X9-X10-X11-X12-X13-X14-X15-X16     II

wherein X7 and X16 represent 0 to 70, preferably 0 to 50 amino acids, more preferably 2 to 20
amino acids, and X8 represents Met, Ile, Ser, Leu, Asn, Phe, or Val, preferably Met, X9 represents
Arg, Ser, Lys, Met, Leu, Glu, Gln, or Asn, preferably Gln or Arg, X10 represents Thr, Ala, Arg,
Leu, Ser, Glu, Asp, Met, Lys, Gln, or Gly, preferably Thr, Ala, or Glu, X11 represents Gln, Ser,
Glu, Leu, Phe, Asp, Thr, or Arg, preferably Gln or Arg, X12 represents Met, Ala, Ile, Asn, Ser, Arg,
Thr, Pro, Leu, Gln, Val, or Lys, preferably Met or Arg, X13 represents Gln, Asn, Pro, Ser, Tyr, Glu,
Leu, Arg, or Lys, preferably Gln, Asn, or Arg, X14 represents Gln, Ala, Pro, Asp, Leu, Lys, Ile,
Glu, Arg, or Asn, preferably Gln or Ile, and X15 represents Met, Ile, Val, His, Ser, Arg, Lys, Phe,
Cys, Gln, Tyr, Ala, Ile, Trp, or Leu.

23. A peptide of the formula II as claimed in claim 22 wherein X7 represents QA, QV, NK, SVQA
(SEQ ID. NO. 98), LSSVQA (SEQ ID. NO. 99), ILSSVQA (SEQ ID. NO. 100), NKILSSVQA
(SEQ ID. NO. 101), HQNKILSSVQA (SEQ ID. NO. 102), THQNKILSSVQA (SEQ ID. NO.
103), ENIK (SEQ ID. NO. 104), SQEINK (SEQ ID. NO. 105), KLSQEINK (SEQ ID. NO.
106), ILNSIQV (SEQ ID. NO. 107), or NSIQV (SEQ ID. NO. 108).

24. A peptide of the formula II as claimed in claim 22 wherein X16 is HG, QS, HGRM (SEQ ID. NO.
109), HGRMVP (SEQ ID. NO. 110), QSVEV (SEQ ID. NO. 111), or TRKP (SEQ ID. NO. 112).

25. A peptide of the formula II as claimed in claim 22 which is MRTQMQQM (SEQ ID. NO. 113),
QAMRTQMQQM (SEQ ID. NO. 114), SVQAMRTQMQQM (SEQ ID. NO. 115),
LSSVQAMRTQMQQM (SEQ ID. NO. 116), ILSSVQAMRTQMQQM (SEQ ID. NO. 117),
MRTQMQQMHG (SEQ ID. NO. 118), MRTQMQQMHGRM (SEQ ID. NO. 119),
MRTQMQQMHGRMVPV (SEQ ID. NO. 120), NEERRSIF (SEQ ID. NO. 121),
INKNEERRSIF (SEQ ID. NO. 122), NEERRSIFTRKP (SEQ ID. NO. 123), MRAQMNQI
(SEQ ID. NO. 124), MRAQMNQIQS (SEQ ID. NO. 125), or MRAQMNQIQSVEV (SEQ ID. NO.
126).

26. A peptide which mediates SAM domain function comprising VVSV (SEQ ID. NO. 21),
SAVVSV (SEQ ID. NO. 22), FSAVV (SEQ ID. NO. 23), FSAVVSV (SEQ ID. NO. 24),
FSAVVSVGD (SEQ ID. NO. 25), VVSVGDWL (SEQ ID. NO. 26), FNTV (SEQ ID. NO. 27),
FNTVDE (SEQ ID. NO. 28), FNTVDEWL (SEQ ID. NO. 29), TSFNTVDEWL (SEQ ID. NO.
30), TSFNTV (SEQ ID. NO. 31), YTSFNTV (SEQ ID. NO. 32), RSEV (SEQ ID. NO. 33),
RSEVLG (SEQ ID. NO. 34), RSEVLGWD (SEQ ID. NO. 35), VPFRSEV (SEQ ID. NO. 36),
and VPFRSEVLGW (SEQ ID. NO. 37).

27. A pharmaceutical composition comprising a peptide as claimed in any one of claims 18 to 26 and
a pharmaceutically acceptable carrier, diluent or excipient.

Sequence Listing

SEQ. ID. NO. 1 
EPHA4

PEFSAVVSVGDWLQAJKMDRYKDNFTAAGYTTL^ 
AMRTQMQQMHGRMVPV 

SEQ. ID. NO. 2 

EPHB2 

PDYTSFOTVDEWLEAJKMGQYKESFANAGFTSFDWSQMMN^ 
QVMRAQMNQIQSVEV 

SEQ. ID. NO. 3 

DGK-delta 

VHLVGTF^VAAWLEllLSLCEYKDIFTRHDIRGSELLHL^^ 
LSRSAPAVEA 

SEQ. ID. NO. 4 

SHIP2 

SGLGEAGMSAWIJ^AJGIXRYEEGLVHNGWDDLEFl^DITEEDLEEAGVQDPAHK^ 
LSK 

SEQ. ID. NO. 5 
RhoGAPp122

LTQIEAXEACDWU^ATGFPQYAQLYEDFLFPIDISLVKREHDFLDRDAIEALCRRLNTI^ 
MKLEISPHRKRS 

SEQ. ID. NO. 6 

Liprin α1-S1

QWDGPTVVVWI^WVGMPAWYVAACRANVKSGA^ 
QEIMSLTSPSAPPT 

SEQ. ID. NO. 7 

Liprin α1-S2

NIIEWIGNEWLPSLGLPQYRSYFMECLVDAKMIJDH^ 
LRRLN YDR KELE 

SEQ. ID. NO. 8 

Liprin α1-S3

VLVWSNDRVIRWILSIGLKYANmjGESGVHGALLAlJ)ETFDFSAiAlXLO 
EFNNLLVMGT 

SEQ. ID. NO. 9

Cortactin-BP1

VHLWTKPDVADWlXSLNUJEHKETFMDNEnXSSHlPNLQKE^ 
QLLDR 

SEQ. ID. NO. 10 
Neurabin 

VHEWSVQQVSHWLVGI^DQYVSEFSAQOTSGEQLLQUXjNKiKJ^^ 
KEMKMSLEKARKAQ 

SEQ. ID. NO. 11 

SLP-76 

RNVPFRSEVLGWDPDSLADYFKKli^TYD 
LSQEINKNEERRSIFTRKP 

SEQ. ID. NO. 12 

Byr2p (S. pombe)

MEYYTSKEVAEWLKSIGU^KYIF.QFSQ^ 
REFPRPCILRF 

SEQ. ID. NO. 13 

Ste4 (S. pombe)

YWNWNNEAVCNWIEQLGFPHKEAFEDY 
KKQKDKLQQE 

SEQ. ID. NO. 14 

Ste11 (S. cerevisiae)

j^^^QLFl^IGCT^ 

SEQ. ID. NO. 15 
STE50 (S. cerevisiae)

I^WSVDDVITWaSTLEVEETDPLCQRLREMMVGD 
MRDSKLEWKDDK 

SEQ. ID. NO. 16 

ETS-1 

PRQWTETHVRDWVWWAVNEKLKGVDFQKFCMNGAA^ 
LEILQKEDVKPYQVNG 

SEQ. ID. NO. 17 

FLI-1

PTLWTQEHVRQWlXWAIKF^SU^n)TSFFQNM
SYLRESSLLAYNTT 

SEQ. ID. NO. 18

TEL 

PrYTVSRDDVAQWLKWAENEFSIJUTOSNTFEM 
LKQRKPRILFSP 

SEQ. ID. NO. 19 

RAE28 

PSQWSVEEVYEFlASLQGCQEIAFXFRSQEIDGQAIXIXKEEHmSAMNI^ 
KET 

SEQ. ID. NO- 20 
Scm 

PIDWTIEEVIQYIESNDNSI^VHGDLFRKHEIDGKAIJJXNSE 
NKVNGRRNNLAL 

SEQ. ID. NO. 21 

VVSV 

SEQ ID. NO. 22

SAVVSV

SEQ ID. NO. 23

FSAVV

SEQ ID. NO. 24

FSAVVSV 

SEQ ID. NO. 25 

FSAVVSVGD 

SEQ ID. NO. 26 

VVSVGDWL 

SEQ ID. NO. 27 

FNTV 

SEQ ID. NO. 28 
FNTVDE 
SEQ ID. NO. 29 
FNTVDEWL 




WO 00/37500 4 

SEQ ID. NO. 30 

TSFNTVDEWL 

SEQ ID. NO. 31 

TSFNTV 

SEQ ID. NO. 32

YTSFNTV 

SEQ ID. NO. 33 

RSEV 

SEQ ID. NO. 34 
RSEVLG 
SEQ ID. NO. 35 
RSEVLGWD 
SEQ ID. NO. 36 
VPFRSEV 
SEQ ID. NO. 37 
VPFRSEVLGW 
SEQ ID. NO. 38 
GYTT 

SEQ ID. NO. 39 
AAGYTT 
SEQ ID. NO. 40 
FTAAGYTT
SEQ ID. NO. 41
DNFTAAGYTT
SEQ ID. NO. 42
YKDNFTAAGYTT
SEQ ID. NO. 43 
HMSQ 

SEQ ID. NO. 44 

HMSQD 

SEQ ID. NO. 45 

HMSQDD 

SEQ ID. NO. 46 

HMSQDDLA 

SEQ ID. NO. 47 
QMMM

SEQ ID. NO. 48 

QMMMED 

SEQ ID. NO. 49 

QMMMEDLL 

SEQ ID. NO. 50 

DITE

SEQ ID. NO. 51 

DITEED 

SEQ ID. NO. 52 

DITEEDL 

SEQ ID. NO. 53

NLTE 

SEQ ID. NO. 54 
NLTEND 
SEQ ID. NO. 55 
NLTENDI 
SEQ ID. NO. 56 
LEAVV 
SEQ ID. NO. 57 
TTLEAVV
SEQ ID. NO. 58
LEAVVHM
SEQ ID. NO. 59
LEAVVHMSQ
SEQ ID. NO. 60
LEAVVHMSQD
SEQ ID. NO. 61
LEAVVHMSQDDL
SEQ ID. NO. 62
LEAVVHMSQDDLAR
SEQ ID. NO. 63
TTLEAVVHMS
SEQ ID. NO. 64
TTLEAVVHMSQD
SEQ ID. NO. 65
TTLEAVVHMSQDDL
SEQ ID. NO. 66
TTLEAVVHMSQDDLAR
SEQ ID. NO. 67
GYTTLEAVV
SEQ ID. NO. 68
GYTTLEAVVHMS
SEQ ID. NO. 69
GYTTLEAVVHMSQD
SEQ ID. NO. 70
GYTTLEAVVHMSQDDL
SEQ ID. NO. 71
GYTTLEAVVHMSQDDLAR
SEQ ID. NO. 72 
FDVVS 

SEQ ID. NO. 73 
FDVVSQ
SEQ ID. NO. 74
FDVVSQMM
SEQ ID. NO. 75
FDVVSQMMME
SEQ ID. NO. 76
FDVVSQMMMEDIL
SEQ ID. NO. 77 
TSFDVVS 
SEQ ID. NO. 78 
TSFDVVSQ 
SEQ ID. NO. 79 
TSFDVVSQMM 
SEQ ID. NO. 80 
TSFDVVSQMMME
SEQ ID. NO. 81
TSFDVVSQMMMEDIL
SEQ ID. NO. 82 
LEFLS 

SEQ ID. NO. 83 
LEFLSD 
SEQ ID. NO. 84 
LEFLSDIT 
SEQ ID. NO. 85 
LEFLSDITEE 
SEQ ID. NO. 86 
LEFLSDITEEDL
SEQ ID. NO. 87
DDLEFLS
SEQ ID. NO. 88 
GWDDLEFLS 
SEQ ID. NO. 89 
DDLEFLSD 
SEQ ID. NO. 90 
DDLEFLSDIT 
SEQ ID. NO. 91 
DDLEFLSDITEE
SEQ ID. NO. 92 
DDLEFLSDITEEDL 
SEQ ID. NO. 93 
GARFL 

SEQ ID. NO. 94 
GARFLN 
SEQ ID. NO. 95 
GARFLNLT 
SEQ ID. NO. 96 
GARFLNLTEN 
SEQ ID. NO. 97 
IDGARFL
SEQ ID. NO. 98 
SVQA 

SEQ ID. NO. 99 
LSSVQA 
SEQ ID. NO. 100 
ILSSVQA 
SEQ ID. NO. 101 
NKILSSVQA
SEQ ID. NO. 102
HQNKILSSVQA
SEQ ID. NO. 103
THQNKILSSVQA
SEQ ID. NO. 104
ENIK

SEQ ID. NO. 105
SQEINK
SEQ ID. NO. 106
KLSQEINK
SEQ ID. NO. 107
ILNSIQV
SEQ ID. NO. 108 
NSIQV 

SEQ ID. NO. 109 
HGRM 

SEQ ID. NO. 110
HGRMVP
SEQ ID. NO. 111
QSVEV 

SEQ ID. NO. 112 
TRKP 

SEQ ID. NO. 113 
MRTQMQQM 
SEQ ID. NO. 114 
QAMRTQMQQM 
SEQ ID. NO. 115 
SVQAMRTQMQQM 
SEQ ID. NO. 116 
LSSVQAMRTQMQQM 
SEQ ID. NO. 117 
ILSSVQAMRTQMQQM
SEQ ID. NO. 118
MRTQMQQMHG
SEQ ID. NO. 119
MRTQMQQMHGRM 
SEQ ID. NO. 120 
MRTQMQQMHGRMVPV 
SEQ ID. NO. 121 
NEERRSIF
SEQ ID. NO. 122 
INKNEERRSIF 
SEQ ID. NO. 123 
NEERRSIFTRKP 
SEQ ID. NO. 124 
MRAQMNQI 
SEQ ID. NO. 125 
MRAQMNQIQS 
SEQ ID. NO. 126 
MRAQMNQIQSVEV 